Audio-Kit reference manual
This user manual describes how to use X-CUBE-AUDIO-KIT, the STM32 ecosystem for audio flow editing and tuning.
On the device side, the firmware infrastructure called AudioChain allows the implementation of audio data flows in a generic manner.
On the computer side, a graphical interface called LiveTune allows the design and tuning of data flows, as well as dataflow code generation to produce the release version of the application.
The package also integrates an algorithm library and dataflow examples.
General information
This document describes the X-CUBE-AUDIO-KIT package and middleware and how to use them. It does not explain audio processing concepts, and there is no need to be an audio processing expert to build and tune dataflows with this ecosystem. Users who need more information, for example when implementing a specific algorithm, can refer to the extensive literature on the subject.
Table 1 presents the definition of acronyms and words that are relevant for a better understanding of this document.
| Term | Definition |
|---|---|
| acChunk | LiveTune audio bus |
| acGraph | LiveTune graphical bus |
| acMsg | LiveTune message bus |
| Algorithm | Audio-handling plugin |
| AudioChain | Multipath audio processing framework |
| Chunk | Multiframe buffer to connect algorithms |
| CPU | Central processing unit (Arm® Cortex®) |
| Data flow | A set of elements connected by a multiframe buffer |
| Element | A part of a data flow |
| Grph | LiveTune graphical pin |
| HAL | Hardware Abstraction Layer |
| IDE | Integrated development environment |
| JSON | JavaScript Object Notation |
| In | LiveTune input audio pin |
| LiveTune | A designer canvas to build a data flow graphically |
| Msg | LiveTune message pin |
| Out | LiveTune output audio pin |
| UART | Universal asynchronous receiver transmitter |
| USB | Universal serial bus |
Table 1 - Definitions and acronyms
System overview
AudioChain overview
AudioChain is a multi-path audio processing framework. It offers a generic way of building simple to complex audio flows made of different processing blocks.
AudioChain feeds from audio buffers and provides output buffers. Thus, it is independent of hardware and BSP source code. It can work on many different types of audio data:
- temporal or spectral domain,
- PCM (Pulse Code Modulation) fixed or floating point, PDM (Pulse Density Modulation) 1-bit LSB first or MSB first,
- interleaved or not,
- mono, stereo, or a wider range of channels,
- etc.
AudioChain supports a wide variety of audio-handling components, ranging from simple routing blocks, gains, and audio playback processing to more advanced signal-cleaning blocks targeting use cases such as voice recognition systems, voice communication systems, and rendering (compression, equalization, etc.).
These elements are provided as plugins called “algorithms” that the user can add and remove to create a complete data flow. It is also possible to encapsulate external or third-party processing blocks to insert them in an AudioChain data flow.
Connecting algorithms together is done through multi-frame buffers called "chunks". As mentioned, AudioChain encapsulates neither the hardware nor the audio BSP. Thus, it needs some specific chunks to feed from any hardware input (microphone or other) and to provide samples to audio hardware outputs (audio codec, USB, etc.). These specific chunks are called system chunks. More details are given in a dedicated section called AudioChain connection to HW.
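As an illustration only, the notion of elements connected by multi-frame chunks can be sketched with a toy model. The names below (Chunk, Gain) are hypothetical and do not reproduce the real audio_chunk_t firmware API:

```python
# Toy model of an AudioChain-style dataflow: elements exchange audio
# through multi-frame buffers ("chunks"). Illustrative sketch only.
from collections import deque

class Chunk:
    """Multi-frame buffer connecting two elements."""
    def __init__(self, nb_frames, frame_size):
        self.frames = deque(maxlen=nb_frames)  # ring of frames
        self.frame_size = frame_size

    def write(self, frame):
        assert len(frame) == self.frame_size
        self.frames.append(frame)

    def read(self):
        return self.frames.popleft()

class Gain:
    """Toy processing element: applies a scalar gain frame by frame."""
    def __init__(self, gain, in_chunk, out_chunk):
        self.gain, self.inp, self.out = gain, in_chunk, out_chunk

    def process(self):
        frame = self.inp.read()
        self.out.write([self.gain * s for s in frame])

# system input chunk -> gain element -> system output chunk
sys_in = Chunk(nb_frames=2, frame_size=4)
sys_out = Chunk(nb_frames=2, frame_size=4)
gain = Gain(0.5, sys_in, sys_out)

sys_in.write([1.0, 2.0, 3.0, 4.0])  # frame delivered by the audio "HW"
gain.process()
print(sys_out.read())  # -> [0.5, 1.0, 1.5, 2.0]
```

The system chunks at the edges play the role of the "SysIn"/"SysOut" chunks described above: they are the only points where hardware buffers enter or leave the dataflow.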
Note that a dataflow can be implemented either using LiveTune or by writing C code manually.
LiveTune overview
LiveTune is a designer canvas to build a dataflow graphically. The result of the design can be tested in three ways:
- Immediately from the tool, using the "Start" button.
- Flashing the design from the tool, using the "Flash" button. The initial executable file that allows the use of the LiveTune designer does not necessarily implement any dataflow; as long as no dataflow has been flashed, no audio processing is executed. The advantage of this way of working is the ability to rapidly flash one dataflow or another.
- Generating the C-code version of the dataflow, using the "Generate code" button. This way of working allows the user to:
  - run AudioChain without LiveTune,
  - generate an executable that implements the dataflow natively,
  - easily share an executable,
  - generate the final firmware of a product.
LiveTune runs in an HTML5-compliant browser such as Chrome or Edge. It does not need any server. Communication with the device is implemented by exchanging JSON-based messages. As of today, the communication is implemented using a virtual COM port over USART.
LiveTune also offers a terminal that allows the user to:
- get feedback traces from the device,
- interact with the device through commands. Any UI event can be triggered through the command line. Trace and cycle-count (de)activation, USB record data flow selection, and so on, can also be done through commands. A "help" command is available that lists all possible interactions.
The following figure provides an overall view of LiveTune.
Upon a successful connection to a device, LiveTune displays a left panel listing the algorithm plugins available for dataflow construction. The list of elements exposed in LiveTune depends on the firmware build (see the Developer API chapter for more explanation). In addition:
- some buttons previously grayed out or hidden appear, according to the board firmware configuration,
- the "EDIT" view becomes editable,
- the "TERMINAL VIEW" starts logging traces and allows interaction through the command line.
Connection is done using the "Connect" button and succeeds only if the firmware running on the device includes the Designer communication firmware. This firmware includes various utility source code for JSON, UART management, and traces, as well as an encapsulation of AudioChain to interface it with LiveTune.
Software overview
The following figure illustrates a high-level architecture view of an application implemented around AudioChain.
LiveTune and AudioChain communicate through JSON commands. Transmission is handled over UART. On top of the BSP and HAL drivers, ST offers several C-code utility modules to ease AudioChain integration:
- Audio (stm32_audio.[ch]): offers a common API to manage different audio hardware.
- Terminal (stm32_term.[ch] and stm32_usart.[ch]): offers a terminal console that allows easy implementation of custom commands communicating through UART.
- STJson (st_json.[ch]): offers an API to read and write JSON commands.
- AudioChain: described in this document. As illustrated, it can encapsulate third-party algorithms (yellow block).
Memory & CPU load
LiveTune runs on a PC, with an enormous amount of memory compared to the board. It is therefore possible to design a dataflow using thousands of elements without paying attention to the board CPU and memory limitations. The board exposes some terminal commands to check this status and manage limitations:
- “mem”: prints the memory status (with an algorithm memory usage summary if the data flow is started).
- “mem2”: prints the memory status (with detailed algorithm memory usage if the data flow is started).
- “cpu”: prints the CPU load status (with detailed algorithm CPU load if the data flow is started).
- “task”: prints the task names and statuses.
Audio configuration
When the LiveTune firmware is active, it is possible to change the audio configuration from the terminal. For instance, the microphone selection and the input/output frequencies can be changed with the following commands:
- “ac audio”: lists all available configurations.
- “ac audio
String identifier
As shown in the previous figure, each AC audio configuration has a dedicated string ID that looks like this: ID_02160216onb080016FF. The current AC audio configuration is displayed in the LiveTune footer as Hardware Configuration: ID_xxxx.
Here is a detailed description of this string format, ID_%02X%02d%02X%02d%c%c%c%02d%02X%02d%02X:
- Number of audio output channels
- Output sampling frequency in kHz
- Number of audio input channels
- Input sampling frequency in kHz
- Three-letter identifier:
  - onb: stands for onboard
  - stv: stands for STEVAL (microphones connected to the 20-pin connector)
  - con: stands for connected. On some STM32 devices, there is automatic detection when an expansion is connected to the 20-pin connector. In this case, there is no need to change the AC audio configuration; plugging and unplugging the microphone expansion transitions between expansion and onboard microphones.
- Frame size of the AudioChain processing in ms
- Enable/disable of the AudioChain low latency mode
- Number of bits per sample
0xFF: reserved for future information
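The string format above can be decoded field by field. Here is an illustrative host-side sketch (this helper is not part of the package; the field names come from the list above):

```python
# Decode an AC audio configuration string ID formatted as
# ID_%02X%02d%02X%02d%c%c%c%02d%02X%02d%02X (see the field list above).
def decode_ac_audio_id(s):
    assert s.startswith("ID_") and len(s) == 3 + 19
    b = s[3:]
    return {
        "out_channels":    int(b[0:2], 16),   # %02X
        "out_fs_khz":      int(b[2:4], 10),   # %02d
        "in_channels":     int(b[4:6], 16),   # %02X
        "in_fs_khz":       int(b[6:8], 10),   # %02d
        "mic_source":      b[8:11],           # %c%c%c: onb / stv / con
        "frame_ms":        int(b[11:13], 10), # %02d
        "low_latency":     int(b[13:15], 16), # %02X
        "bits_per_sample": int(b[15:17], 10), # %02d
        "reserved":        int(b[17:19], 16), # %02X, 0xFF today
    }

cfg = decode_ac_audio_id("ID_02160216onb080016FF")
print(cfg["out_channels"], cfg["in_fs_khz"], cfg["mic_source"], cfg["frame_ms"])
# -> 2 16 onb 8
```

Decoding the example ID from above thus gives 2 output channels at 16 kHz, 2 input channels at 16 kHz, onboard microphones, and 8 ms frames with 16 bits per sample.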
Why does the PDM source disappear (SysIn-MicroPDM)?
On some STM32 devices, the digital microphones may be connected to different IPs depending on whether they are the onboard microphone or the expansion microphones connected through the 20-pin audio connector. For instance, on the STM32H735G-DK, the onboard microphone is connected to the SAI-PDM, while the expansion microphones are connected to the DFSDM.
In this package, microphone capture is managed by the Utilities/Audio component. When capturing through the DFSDM, it delivers PCM samples. However, when capturing through the SAI-PDM, it can deliver either PDM, PCM, or both PDM and PCM samples. For PCM samples, the decimation is performed by the PDM2PCM software library. In the current version, it delivers both PDM and PCM samples.
Therefore, switching the ac audio config from onboard to expansion microphones results in the presence or absence of the SysIn-MicroPDM input, while the SysIn-Microphones input, which corresponds to PCM data, remains present at all times.
The advantage of keeping SysIn-Microphones is to maintain common data flows for both onboard and expansion microphones. The advantage of having SysIn-MicroPDM in LiveTune is the ability to:
- apply delays in the PDM domain,
- replace the decimation with the CIC filter (similar to the DFSDM processing).
Trace activation
By default, AudioChain shows only error traces. However, the trace level can be adjusted using dedicated commands. There are two types of traces:
LiveTune traces, using the “trace” command:
- “trace help” prints all possibilities.
- It allows setting the trace level; for instance, “trace set warning” activates warnings.
- It allows activating JSON info in traces; for instance, “trace set json”.
- It allows tuning the format of traces; for instance, “trace set colorize”.
AudioChain traces, using the “ac” command:
- “ac help” prints all possibilities.
- “ac set log_init” prints logging information about AudioChain, as illustrated below.
- “ac set log_mallocs” prints every malloc and free done by the pieces of software in the firmware that use the AudioMalloc and AudioFree APIs.
- “ac set log_os” prints information about OS tasks (priority, stack memory usage, etc.) whenever it changes.
- “ac set log_cycles” activates periodic cycle-consumption prints. For cycle consumption, the “cpu” command might be easier, as it provides a summarized report of CPU loads.
AudioChain Info : hChunk_cnx_2 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: user IN/OUT, nbFrames=2, nbSamples=256
AudioChain Info : hChunk_cnx_4 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: user IN/OUT, nbFrames=2, nbSamples=256
AudioChain Info : hChunk_cnx_5 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: user IN/OUT, nbFrames=2, nbSamples=256
AudioChain Info : hChunk_cnx_1 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: user IN/OUT, nbFrames=2, nbSamples=256
AudioChain Info : hChunk_cnx_3 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: user IN/OUT, nbFrames=2, nbSamples=256
AudioChain Info : SysInChunk1 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: system IN, nbFrames=2, nbSamples=256
AudioChain Info : SysOutChunk2 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: system OUT, nbFrames=3, nbSamples=384
AudioChain Info : SysOutChunk1 Initialized & allocated
AudioChain Info : => BuffInfo: nbChannels=2, nbSamples=128, fs=16000, duration=8000000ns, sample=2-byte, time sample, fixed-point 16, interleaved
AudioChain Info : => ChunkInfo: system OUT, nbFrames=3, nbSamples=384
AudioChain Info : Audio Graph:
AudioChain Info : * signal-generator "signal-generator-1" (id 0):
AudioChain Info : - output chunk(s):
AudioChain Info : . hChunk_cnx_2 -> split "split-1" (id 1) (in 0)
AudioChain Info : * split "split-1" (id 1):
AudioChain Info : - input chunk(s):
AudioChain Info : . hChunk_cnx_2 <- signal-generator "signal-generator-1" (id 0) (out 0)
AudioChain Info : - output chunk(s):
AudioChain Info : . hChunk_cnx_4 -> route "route-1" (id 4) (in 0)
AudioChain Info : . hChunk_cnx_5 -> mix "mix-1" (id 3) (in 0)
AudioChain Info : . SysOutChunk2
AudioChain Info : * hpf "hpf-1" (id 2):
AudioChain Info : - input chunk(s):
AudioChain Info : . SysInChunk1
AudioChain Info : - output chunk(s):
AudioChain Info : . hChunk_cnx_1 -> mix "mix-1" (id 3) (in 1)
AudioChain Info : * mix "mix-1" (id 3):
AudioChain Info : - input chunk(s):
AudioChain Info : . hChunk_cnx_5 <- split "split-1" (id 1) (out 1)
AudioChain Info : . hChunk_cnx_1 <- hpf "hpf-1" (id 2) (out 0)
AudioChain Info : - output chunk(s):
AudioChain Info : . hChunk_cnx_3 -> route "route-1" (id 4) (in 1)
AudioChain Info : * route "route-1" (id 4):
AudioChain Info : - input chunk(s):
AudioChain Info : . hChunk_cnx_4 <- split "split-1" (id 1) (out 0)
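The BuffInfo/ChunkInfo figures in the log above are self-consistent; the relations can be checked with plain arithmetic (this is not an AudioChain API, just a verification of the printed values):

```python
# Sanity check of the log values: the frame duration derives from
# nbSamples and fs, and the chunk sample count from nbFrames.
nb_samples, fs, nb_frames = 128, 16000, 2

duration_ns = nb_samples * 1_000_000_000 // fs
chunk_samples = nb_frames * nb_samples

print(duration_ns)    # -> 8000000, matching "duration=8000000ns"
print(chunk_samples)  # -> 256, matching "nbSamples=256" in ChunkInfo
```

Note that the system output chunks use nbFrames=3, which gives nbSamples=384 in the same way.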
Host <=> device communication
LiveTune is a slave: all displayed information comes from the board. Such information is transmitted during the connection process, during editing, and while a dataflow is running. Information exchanges use the JSON format.
Physically, the UART is used for communication between the PC and the board. Today, the UART setting is 921600:8:N:1.
The board uses two kinds of transmissions:
- regular TTY transmission, used by the terminal;
- packet transmission. Transmitted packets are displayed by the LiveTune terminal. Packets are used exclusively to transmit error messages, heartbeats, data, and so on. A packet message starts with a “start code” and ends with a “terminal code”.
LiveTune uses the terminal command line to communicate with the board. For example, when the user clicks the “Connect” button in the LiveTune interface, a routine is executed on the board that transmits packets including the list of elements and the project saved in its persistent storage.
The device and LiveTune communicate by CLI commands. For instance, the user can call the “get_elements” command to get the list of processing blocks available in the currently running firmware.
Memory allocation policy control
Memory allocation is done thanks to the component called STPMem.
STPMem description
It provides an API that implements custom memory pool management using standard allocation functions (malloc, realloc, calloc, free). The API also implements tools to detect memory corruption, and blocks can be marked with a name or a tag. Blocks support byte alignments of 4, 8, 16, or 32.
This component is very useful to take advantage of all the memory available on a device, even when it is fragmented into several physical blocks (RAM, RAM1, RAM2, SDRAM, etc.). However, its usage can be tricky, especially when writing the linker file.
To locate usage of this component in the source, grep for the define ST_USE_PMEM. To get an idea of the pools created, look for the function called st_os_mem_create_pool.
For instance, this call will create a 32KBytes pool inside the RAM at address 0x30000000UL:
st_os_mem_create_pool(ST_Mem_Type_POOL1, 0x30000000UL, 32U, "RAM2", 0);
Linker file configuration
It is mandatory to take good care of the linker file when using STPMem; otherwise, strange behavior may occur. For instance, assuming that:
- the linker file uses the memory at 0x30000000UL for whatever purpose (section creation, .data, C stack, etc.),
- ST_Mem_Type_POOL1 was created as shown above,
then the call to st_os_mem_create_pool will memset this memory zone and the data will be erased, which creates very awkward behavior.
Memory pool configuration
The “ac chunk_pool <n>” terminal command allows you to choose the memory pool for allocating audio chunks (i.e., the audio buffers linking algorithms in an audio data flow).
The “ac algo_pool <n>” terminal command allows you to choose the memory pool for allocating algorithm handles and configuration structures (i.e., the generic part of algorithm memory).
The memory pool options are:
- n=0 for Tightly Coupled Memory (TCM): the fastest memory, but often the smallest in size.
- n=1 for internal RAM.
- n=2 for external RAM: the slowest memory, but often the largest in size.
Using the “ac chunk_pool” or “ac algo_pool” commands without the <n> parameter returns the current status.
Note: executing the “ac chunk_pool <n>” and “ac algo_pool <n>” terminal commands stops the audio data flow if it is running.
Additionally, most algorithms have an added ramType parameter for allocating specific buffers. For more details, see Algorithms Memory Pool Allocation.
Depending on the platform, the availability of these three types of memory (TCM, Internal RAM, and External RAM) may vary. If the requested memory pool is not available or if there is insufficient remaining memory, the memory allocator will attempt to allocate memory from another pool using the following fallback algorithm:
- From the fastest to the slowest memory if TCM or Internal RAM is requested.
- From the slowest to the fastest memory if External RAM is requested.
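The fallback behavior described above can be sketched as follows. This is an illustrative model only: the real allocator lives in the firmware, the "free_bytes" state is hypothetical, and "try the requested pool first, then continue in the stated direction" is one plausible reading of the rule:

```python
# Toy model of the documented memory-pool fallback direction.
POOLS = {0: "TCM", 1: "Internal RAM", 2: "External RAM"}

def fallback_order(requested):
    # Assumed interpretation: try the requested pool first, then fall
    # back in the documented direction.
    if requested in (0, 1):            # TCM or internal RAM: fast -> slow
        return list(range(requested, 3))
    return [2, 1, 0]                   # external RAM: slow -> fast

def allocate(requested, free_bytes, size):
    for n in fallback_order(requested):
        if free_bytes.get(n, 0) >= size:
            return POOLS[n]
    return None  # no pool can satisfy the request

free = {0: 1024, 1: 64 * 1024, 2: 8 * 1024 * 1024}
print(allocate(0, free, 16 * 1024))  # TCM too small -> "Internal RAM"
print(allocate(2, free, 4 * 1024))   # -> "External RAM"
```

The “mem2” command is the way to observe, on the real target, which pool was actually used for each chunk and algorithm.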
When the audio data flow is started, all these buffers are allocated. The “mem2” terminal command then displays detailed information regarding the memory allocation of chunks and algorithms, including the requested and actually allocated memory pools.
Note: Algorithms always request a small amount of memory in External RAM for a description string, as there is no performance constraint for this memory. Users cannot change this memory pool.
LiveTune description
LiveTune is built around three notions:
Elements
Pins
Buses
LiveTune elements
AudioChain uses the concept of a dataflow. An audio dataflow is a set of elements connected by multi-frame buffers. LiveTune uses several types of elements symbolized by dedicated icons. An element represents a connectable item. An element can be:
- an algorithm or processing block:
  - some are already available in AudioChain,
  - others can be external or third-party solutions (see the Speex echo canceller for instance; in this case, a proper wrapping/integration was done),
- a sink or a source element handling the connection with the hardware,
- a tool (running on the PC only) allowing the display of information, typically coming from algorithms' control callbacks.
An element has its own capabilities. These capabilities can be fully restricted, meaning that the element supports:
- a single sampling frequency,
- a single number of channels,
- a single number of input and output buses,
- a single interleaving format,
- a single sample format (float, q15, q31, etc.),
- temporal or spectral data.
On the contrary, some elements may support different settings for these parameters.
| Icon | Type | Comment |
|---|---|---|
| (icon) | AudioChain Algorithm element | A processing block. It processes, generates, or analyses audio data provided and delivered in the shape of audio chunks. It may also generate control events/notifications. |
| (icon) | AudioChain Sink element | An element connected to the hardware or an audio output. Most of the time, it is a bus connected to a loudspeaker, a USB, or Ethernet. However, other sink components may be added. |
| (icon) | AudioChain Src element | An element connected to the hardware or an audio source. Most of the time, it is the bus connected to a microphone, a USB, or Ethernet. However, other source components may be added. |
| (icon) | LiveTune tool | An element not using an AudioChain bus. It takes the form of a visualizer or any independent tool. |
Table 2 - LiveTune icons
LiveTune pin types
Creating a dataflow involves connecting several elements from a source pin to a destination pin. Each element exposes named pins.
There are 3 different types of pins:
- Audio pins: often called “In” and “Out” by default. They may have specific names, for instance “Err” for the “nlms” element.
- Msg: connectable to an acMsg bus; control data that can be displayed by message viewers.
- Grph: connectable to an acGraph bus; data that can be displayed by graphical viewers.
Elements can have different capabilities. The connection between two elements requires consistency between their capabilities. LiveTune detects most cases and prevents such errors; other errors are detected at start time by the firmware.
LiveTune bus types
Connections between elements are using different types of buses according to the type of the pins. There are 3 types of buses:
- The acChunk bus is used to connect audio pins together. This type of bus is internal to AudioChain. The term “chunk” refers to AudioChain's firmware: the audio_chunk_t structures are multi-frame audio buffers.
- The acGraph bus is used to connect “Grph” pins together. It sends data to a graphical viewer. This bus is not internal to AudioChain; it is a connection between the device and LiveTune, meaning that the data transits over UART from the device to LiveTune.
- The acMsg bus is used to connect “Msg” pins together. It connects text data to a message viewer. This bus is not internal to AudioChain; it is a connection between the device and LiveTune, meaning that the data transits over UART from the device to LiveTune.
Thus, it is not allowed to connect audio pins to Msg or Grph ones. Authorized connections are:
Audio pins to audio pins using acChunk bus,
Grph pins to Grph pins using acGraph bus,
Msg pins to Msg pins using acMsg bus.
In order to send audio data to a graphical or message viewer, a bus converter must be used, such as the element called “capture”. The following figure illustrates this “capture” element.
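The pin/bus compatibility rule stated above can be summarized in a few lines. This is a toy model, not LiveTune code; it simply encodes that a connection is valid only between pins of the same family, each carried by its dedicated bus type:

```python
# Toy model of the LiveTune pin-to-bus compatibility rule.
PIN_TO_BUS = {"audio": "acChunk", "Grph": "acGraph", "Msg": "acMsg"}

def bus_for_connection(src_pin_type, dst_pin_type):
    if src_pin_type != dst_pin_type:
        raise ValueError(f"cannot connect {src_pin_type} pin to {dst_pin_type} pin")
    return PIN_TO_BUS[src_pin_type]

print(bus_for_connection("audio", "audio"))  # -> acChunk
print(bus_for_connection("Msg", "Msg"))      # -> acMsg
# bus_for_connection("audio", "Grph") would raise: a converter element
# such as "capture" is needed instead.
```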
LiveTune - How to build a dataflow
Building a dataflow consists of creating element instances, then connecting and tuning these elements. To create an element instance, select an element from the elements panel, then drag and drop it onto the Graph View.
Note that hovering over the element panel pops up a tooltip with a short element description.
To delete an element, right-click on the element instance and click on the cross. Selecting the element and hitting the delete key also works.
Connecting elements in LiveTune
To build a connection, the user must click on the source pin and drag the connection onto the destination pin. The connection is then created when the mouse button is released.
If the connection is rejected, look at the terminal: an inconsistency between pins might have been detected and logged. In that case, consider using an intermediate element such as the SFC (Sample Format Converter), a router, or any other block that can convert formats.
To delete a connection, right-click on the connection wire; the delete icon pops up. The connection is removed by clicking on the cross (see the next figure).
An element can expose several pins with various names. A tooltip is displayed when the mouse hovers over a pin. The message gives information about the number of connections accepted or the limitations of the pin. The following figure illustrates this for the processing block called “splitter”.
An element description is available when passing the mouse pointer over the icon: for a simple algorithm it is just a short description, whereas for more complex ones it is a link to their documentation.
The same tooltip mechanism is available for connections. It specifies the bus type, the connection name, and optionally whether some parameters can be changed (i.e., several capabilities are supported).
NO_CHANGE means that the acChunk parameters are inherited from the source element; this behavior is referred to as “capability propagation”. For instance, assuming a router takes as input a floating-point acChunk and delivers into an output acChunk that has bufferType = NO_CHANGE, the data type is unchanged. On the contrary, if the user sets bufferType = ABUFF_FORMAT_FIXED16, then the data type is converted.
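The propagation rule can be sketched in a few lines. This is a toy model, not firmware code; in particular the floating-point constant name used below is assumed for illustration (only ABUFF_FORMAT_FIXED16 appears in this document):

```python
# Toy model of "capability propagation": an output acChunk field set to
# NO_CHANGE inherits its value from the source, while an explicit value
# forces a conversion.
NO_CHANGE = "NO_CHANGE"

def resolve_buffer_type(input_buffer_type, output_setting):
    if output_setting == NO_CHANGE:
        return input_buffer_type   # inherited from the source element
    return output_setting          # explicit setting wins: data converted

# A router fed with floating-point samples (FLOAT name assumed):
print(resolve_buffer_type("ABUFF_FORMAT_FLOAT", NO_CHANGE))
# -> ABUFF_FORMAT_FLOAT (unchanged)
print(resolve_buffer_type("ABUFF_FORMAT_FLOAT", "ABUFF_FORMAT_FIXED16"))
# -> ABUFF_FORMAT_FIXED16 (data converted)
```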
Changing the connection settings is done by clicking on the connection wire. The following figure shows the type of popup dialog box that allows tuning the connection.
How to use the propagation capability
In most cases, NO_CHANGE setting works fine. However, input vs output acChunk settings can be part of the algorithm’s configuration and must be updated according to the expected behavior. There are two situations to distinguish:
Optional acChunk configuration
Mandatory acChunk configuration
Optional acChunk configuration
Some algorithms accept input-to-output buffer conversion as an option. Please refer to the algorithm's dedicated documentation to know whether such a conversion is supported. Using such a conversion can be beneficial for optimization purposes (some algorithms work better with a specific buffer format, for instance).
These conversions are based on the SFC (Sample Format Converter) algorithm, meaning that any algorithm that offers these conversions encapsulates the SFC. Please refer to its dedicated documentation, which explains in detail how to use the acChunk configuration to perform such conversions. In the following figure, the SFC is used to convert FIXED16 interleaved samples into non-interleaved float samples.
Here is a non-exhaustive list of algorithms that offer such features:
SFC
gain
mix / linear_mix
mono2stereo / stereo2mono
route
split
switch
Mandatory acChunk configuration
Some algorithms require input-to-output buffer conversion for compatibility with the next algorithm. Sources of compatibility issues can be:
- format, for instance:
  - it is not possible to connect an acChunk with FIXED16 data to an algorithm that supports only floating point,
  - it is not possible to connect an acChunk with temporal-domain data to an algorithm that only consumes frequency bands,
  - it is not possible to connect an acChunk with PDM data to an algorithm that only consumes PCM data,
  - and so on;
- sampling frequency:
  - this is the case for the resampler, for which the resampling ratio is deduced from the input and output sampling frequencies,
  - remark: the input and output nbElements must also be consistent with this resampling ratio;
- number of elements:
  - when using only temporal-domain data at a fixed sampling frequency, there should not be any issue with keeping NO_CHANGE, as this field is the number of samples all along the dataflow,
  - however, for frequency-domain data this field is no longer the number of samples but the number of frequency bands; please refer to the WOLA algorithm documentation, which provides a detailed description of this topic,
  - as mentioned, the resampler also requires fine tuning of this field.
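For the resampler case above, the consistency constraint is pure arithmetic: the output frame's number of elements must equal the input frame scaled by the resampling ratio deduced from the in/out sampling frequencies. A small illustrative check (not a package API):

```python
# Check that in/out nbElements are consistent with the resampling ratio.
def expected_out_elements(in_elements, fs_in, fs_out):
    out = in_elements * fs_out / fs_in
    if out != int(out):
        raise ValueError("in_elements incompatible with this resampling ratio")
    return int(out)

# e.g. a 128-sample frame resampled from 16 kHz to 48 kHz:
print(expected_out_elements(128, 16000, 48000))  # -> 384
```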
Here is a non-exhaustive list of algorithms requiring an update of the acChunk configuration:
- g711dec / g711enc: bufferType must be updated according to the expected buffer conversion,
- pdm2pcm,
- resample,
- SFC (Sample Format Converter): the algorithm's purpose is to modify bufferType and/or interleaving, so the user must update the acChunk configuration accordingly,
- wolaFwd / wolaInv.
Saving and opening designs
The LiveTune UI offers different buttons to manage dataflow editing and backups:
- Flash: stores the design on the device. The firmware has persistent storage in flash; once this button has been hit, the design is automatically loaded upon reboot.
- Load: opens a design in the canvas. Loading an existing design in LiveTune can also be done by dragging and dropping the xxx.livetune file onto the canvas. The project is loaded and transmitted to the board, but not persistently.
- Save: saves the design currently in the canvas area. The file extension is xxx.livetune.
- New: clears the canvas.
- Re-Org: lays out all blocks nicely.
- Center: places the dataflow in the center of the canvas.
- Help: opens the Welcome page, with many links to different kinds of help.
Status bar
When a project is loaded in LiveTune, the status bar and the header display some information concerning the firmware.
- Firmware name: shows the firmware name; the name indicates a configuration or a special build for a specific use case.
- Firmware version: shows the firmware version number.
- Heartbeat: shows the board activity.
  - Heart beating: the board is alive.
  - Heart static: the board is not connected or has crashed.
  - Heart white: the board dataflow is not started (the page header is blue).
  - Heart red: the dataflow is started and active (the page header is green).
- LiveTune version: shows the LiveTune version number.
- UART bandwidth: shows the real-time transmission speed; the maximum theoretical speed is 102 KB/s for a UART at 921600 baud (default value). Notice that the color becomes red when the bandwidth is close to the limit.
- CPU load: shows the real-time board CPU load.
LiveTune status
Running a design
When the audio flow design is done, the user may want to run it and check the processing result. This is done by pushing the “Start” button. If the dataflow is correctly configured, the heartbeat icon toggles to red and the page header turns green.
The user can then modify any parameter according to the use case and check the real-time processing using the board microphones, loudspeaker, USB, or Ethernet, according to the design.
However, sometimes the dataflow has incompatible capabilities or settings and does not start. In such a case, the heart remains white, the page header remains blue, and the log traces inside the terminal help in understanding the reason for the failure.
Once an algorithm is configured, it is possible to collapse all its parameters to reduce the box size. This is achieved with a double click on the algorithm icon: the first double click collapses the box, the second expands it.
When the design is finished, the user can demo LiveTune by enabling the auto_start flag. The auto_start flag tells the board to start the dataflow embedded in the persistent storage immediately at boot.
The user can enable auto start using the CLI terminal commands:
cli>ac set auto_start : enable the auto start.
cli>ac clear auto_start : disable the auto start.
Tuning a dataflow
When the dataflow is successfully started, all parameters are modifiable in real time except some rare exceptions where the dataflow must be stopped before. In such a case, LiveTune will automatically stop the dataflow. The heart will return to the white.
LiveTune will also automatically stop the dataflow when user changes the topologies such as a pin disconnection, or a capability change during a playback. In that case, it is safer to restart the dataflow after modifications are done.
Note that parameter values are transmitted to the board as soon as a modification is made through a controller (slide bar, drop-down list, etc.). However, they are not saved in the livetune file.
During algorithm tuning, the parameters that are modified will be shown in blue, while those that remain at their default values will be displayed in black.
Generate C-code
When the design is done and tuned, the user may want to run it without the designer. At this development step, the code that manages LiveTune is no longer mandatory, so the code/data size can be minimized. LiveTune can generate this code automatically using the Generate code button, and shows a preview of the generated code in another browser tab.
Note: Depending on the browser settings, the tab might be blocked and may not pop up. Nevertheless, the graph is created and the link is available in the browser; click on the icons as shown below.
The Copy button copies the source code to the clipboard, and the Save button saves the source code to a project folder. This file can then be linked into the C-code project to generate a standalone executable.
Prepare a release
When building a product based on Audio-Kit, a user may start from the supported boards in this package for:
- dataflow design,
- performance analysis,
- footprint evaluation for the final STM32 selection.
This is done thanks to the Designer target inside the IDE projects.
Once the STM32 selection and the dataflow design are done, the user may need a smaller firmware for:
- development:
  - This is the `Tuner` target.
  - The firmware only embeds the algorithms and elements that are used in the dataflow, and the livetune footprint is highly reduced.
  - Livetune cannot change the dataflow topology with such a firmware version.
  - However, it is still possible to tune the parameters of the dataflow elements:
    - From the edit view: change the parameters of the elements, in the same way you would do in full livetune mode.
    - If you want to regenerate a new `Tuner` or `Release` FW with this new configuration, you will need to update the audio_chain_generated_code.c file manually with this new configuration (the Generate code button is not available in the `Tuner` target).
    - Press the Enter key to enter CLI mode.
    - Enter help to show all the available commands.
- final product/release:
  - This is the `Release` target.
  - The minimum source code is linked. There are no development traces and the terminal is disabled.
  - The graph is not visible in the edit view.
  - The configuration cannot be modified.
  - By default, the firmware runs on a standalone target with an optimization for speed.
The sequence to follow is:
- Generate and save the code with the `Designer` target. LiveTune allows the automatic generation of this code using the Generate code button as described here.
- Replace the audio_chain_generated_code.c file content (given as an example with an echo use case) with your generated code.
  - Remark: the ‘generated code’ is given with ST-specific code that initializes the audio HW in the same configuration as the one used when the code was generated (sampling frequency, frame duration, type of system input, etc.); this code is only for builds with the `Tuner` or `Release` target of the IDE project. It should be replaced with your own specific code for audio HW initialization.
- Implement control callbacks as explained here.
- Rebuild the project.
- Add your own specific code.
Control callbacks constraints
Why do the Message-Viewer and Graph-Viewer not display any data in the Tuner target?
Some algorithms may need user-dependent pieces of code such as control callbacks. They are not generated. However, LiveTune generates empty callbacks to make the work of the developer easier while implementing the final application. For example, the spectrum algorithm exposes a control callback that provides the spectrum calculation. In the designer context, the spectrum values are sent to LiveTune to draw a graph. Upon code generation, such callbacks are generated with an empty body.
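As an illustration, the pattern could look like the sketch below: the generated callback has an empty body, and the developer adds a real implementation for the final application. All names here are hypothetical, not the actual generated symbols.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical control-data structure for a spectrum algorithm. */
typedef struct
{
    const float *pSpectrum; /* magnitude values computed by the algorithm */
    uint32_t     nbBands;   /* number of frequency bands */
} spectrum_ctrl_data_t;

/* What LiveTune generates: an empty body, to be filled by the developer. */
static void spectrum_control_cb(const spectrum_ctrl_data_t *pData)
{
    (void)pData; /* empty body generated upon code generation */
}

/* What a user implementation might add, e.g. finding the peak band. */
static uint32_t spectrum_peak_band(const spectrum_ctrl_data_t *pData)
{
    uint32_t peak = 0U;
    for (uint32_t i = 1U; i < pData->nbBands; i++)
    {
        if (pData->pSpectrum[i] > pData->pSpectrum[peak])
        {
            peak = i;
        }
    }
    return peak;
}
```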
AudioChain firmware brief
As stated, the AudioChain FW was built to be HW independent. Data can be in the time or frequency domain, and all buffer pointers are automatically incremented for both read and write, making AudioChain generic. Wires between blocks are handled by software multiframe buffers called audio chunks.
AudioChain software components
AudioChain is built around a few software components:
AudioBuffer:
Description of audio buffer (sampling frequency, channels, duration, interleaving, sample format)
Single buffer (no ping pong)
AudioChunk:
Adds notion of number of frames on top of the AudioBuffer
AudioChain automatically updates all read write pointers when copying and/or processing samples.
These are the wires between algorithms in a dataflow.
AudioChunkList:
- An algorithm can have several input/output wires. Since wires are AudioChunks, multiple IOs are managed with chained lists, hence the need for AudioChunkList.
AudioAlgo:
Encapsulation of processing blocks, whether rudimentary or advanced.
It offers API for user configuration.
Requires the registration of a few callbacks for automatic integration inside AudioChain:
- checkConsistency
- checks that inputs & outputs are in line with the algorithm’s capabilities and that the algorithm’s configuration is consistent with these inputs & outputs.
- a large part of the consistency check is handled generically for all algorithms. However, some algorithms may have specificities.
- init
- initialization of the algorithm as a function of the input and output chunks and the static configuration (parameters which have an impact on memory allocation or that cannot be modified while the algorithm is running)
- deinit
- algorithm closure.
- configure
- updates the dynamic configuration (parameters which have no impact on memory allocation and which can be modified while the algorithm is running).
- dataInOut
- Feeds the algorithm and retrieves processed samples.
- High-priority task; therefore, it should not require a high CPU load.
- Automatically triggered when the system input chunk has at least one frame written.
- process
- Lower priority than dataInOut.
- Dedicated to higher CPU load processing routines.
- Automatically triggered when dataInOut decides that there is enough data.
- Automatically triggers control interface when needed.
- AudioChain offers an API to specify a priority level for the processing task. It allows dealing with algorithms for which the processing time may be longer than the frame duration, without any underrun or side effects. The API is the field “prio_level” inside audio_algo_common_t. The default value is AUDIO_CAPABILITY_PROCESS_PRIO_LEVEL_NORMAL. Setting it to AUDIO_CAPABILITY_PROCESS_PRIO_LEVEL_LOW automatically distributes the actual processing to a lower priority thread/task.
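A minimal sketch of routing a heavy algorithm to the low-priority processing thread via prio_level. The enum and struct are mocked here so the snippet stands alone; only the field and value names come from the text above.

```c
/* Stand-alone mock: the real enum and audio_algo_common_t live in AudioChain;
 * they are redefined here only so this snippet compiles by itself. */
typedef enum
{
    AUDIO_CAPABILITY_PROCESS_PRIO_LEVEL_NORMAL = 0, /* default value */
    AUDIO_CAPABILITY_PROCESS_PRIO_LEVEL_LOW
} audio_process_prio_level_t;

typedef struct
{
    audio_process_prio_level_t prio_level; /* other audio_algo_common_t fields omitted */
} audio_algo_common_t;

/* Route a heavy algorithm (e.g. an ASR) to the low-priority process thread. */
static void set_low_priority(audio_algo_common_t *pCommon)
{
    pCommon->prio_level = AUDIO_CAPABILITY_PROCESS_PRIO_LEVEL_LOW;
}
```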
- control
- gets & sets non-audio data from/of the algorithm. For instance, an event like the wake word detection of an ASR is handled through this callback.
AudioAlgoList
- chained list of AudioAlgo for generic management
AudioChain
Uses all of the above
Creates an ordered list of AudioAlgo
Offers callbacks for common dataInOut, process & control, meaning that a common dataInOut/process/control callback will call all the dedicated AudioAlgo dataInOut/process/control callbacks respectively:
all algos’ dataInOut callbacks run in the same OS thread with dataInOutPriority
all algos’ process callbacks with normal priority level run in the same OS thread with processNormalPriority
all algos’ process callbacks with low priority level run in the same OS thread with processLowPriority
all algos’ control callbacks run in the same OS thread with controlPriority
priority levels follow this rule:
audioCapturePriority >= dataInOutPriority > processNormalPriority > processLowPriority
The controlPriority level may vary depending on the use case; it may be important to have it higher than process to guarantee stable latency on control notifications.
audioCapturePriority refers to the priority of the audioCapture thread (it manages the input/output audio codecs and fills/dumps AudioChain’s input/output system chunks).
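The priority rule above can be captured as a simple check, assuming a convention where a larger number means a higher priority. The helper below is illustrative, not part of AudioChain.

```c
#include <stdbool.h>
#include <stdint.h>

/* Checks the documented ordering:
 * audioCapturePriority >= dataInOutPriority > processNormalPriority > processLowPriority
 * (assumption: higher number = higher priority; RTOS conventions vary). */
static bool prio_rule_ok(uint32_t capture, uint32_t dataInOut,
                         uint32_t processNormal, uint32_t processLow)
{
    return (capture >= dataInOut) &&
           (dataInOut > processNormal) &&
           (processNormal > processLow);
}
```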
Factory
- Hosts all the descriptions of common elements (parameters, chunks)
System input and output
Offers an API to add System Inputs and/or System Outputs to match any setup.
As an example, the SysIOs of this package are implemented in the files “audio_chain_sysIOs_conf.[ch]”.
For a simple example, please read “audio_chain_sysIOs_conf_template.c”.
AudioChain APIs
AudioChain has two levels of API:
- user API
- developer API
User API
The user API allows designing a dataflow with existing elements. It has been designed for simplicity and for stability over time.
The user API lies on top of the developer API but hides all its complexity. All elements are accessible and manageable using enumerated strings. In other words, all one needs to know to play with an algorithm is its name and its parameters’ names. LiveTune displays all the mandatory information needed for tuning. All algorithms have a constructor and a default value for each parameter.
In this way, a minimum stability of the API is guaranteed, because user code is not exposed to the development phase, where headers and contents may change at any time.
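To illustrate the idea of enumerated strings, a string-keyed parameter table could look like the sketch below. This is purely conceptual; none of these names or types are the real AudioChain API.

```c
#include <string.h>
#include <stddef.h>

/* Conceptual sketch: a string-keyed table keeps the user API stable, since
 * callers only need names, never internal headers. Names are hypothetical. */
typedef struct
{
    const char *name;          /* parameter name, as shown in LiveTune */
    float       default_value; /* every parameter has a default */
} param_descr_t;

static const param_descr_t k_gain_params[] =
{
    { "gain_db", 0.0f },
    { "mute",    0.0f },
};

/* Look a parameter up by name, as the user API does with enumerated strings. */
static const param_descr_t *find_param(const char *name)
{
    for (size_t i = 0U; i < sizeof(k_gain_params) / sizeof(k_gain_params[0]); i++)
    {
        if (strcmp(k_gain_params[i].name, name) == 0)
        {
            return &k_gain_params[i];
        }
    }
    return NULL; /* unknown parameter */
}
```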
Developer API
The developer API offers a common way of integrating simple to complex processing blocks into AudioChain so that they can appear automatically in LiveTune.
The developer API allows integrating any type of audio processing parameter the user wants to expose in LiveTune.
If users want to integrate their own processing block inside LiveTune & AudioChain, they have to wrap it using this API. To make this easy, a script is provided for that matter. Please refer to the Algorithm Integration chapter.
AudioChain connection to HW
As explained, HW is not encapsulated inside AudioChain. Thus, the user needs to connect AudioChain to any audio data coming from/going to HW. The HW inputs and outputs are called System Inputs & System Outputs.
We also refer to these as SysIOs in the firmware source. Please refer to System Inputs & Outputs.
Latency of a data flow
AudioChain offers a low latency mode which is enabled through:
- the field `isProcessSpecificTask` from the `audio_chain_instance_params_t` structure.
  - setting it to true:
    - Forces AudioChain to run its ‘underrun/overrun safe’ implementation.
    - AudioChain will have one task for all algorithms’ `dataInOut` callbacks and a lower priority task for all algorithms’ `process` callbacks.
    - This implementation impacts the overall dataflow latency because processing starts when all `dataInOut` callbacks are done.
  - setting it to false:
    - Forces AudioChain to run its ‘low latency’ implementation.
    - AudioChain will have a single task for all algorithms’ `dataInOut` and `process` callbacks.
    - The algorithm’s `process` callback is called right after the `dataInOut` callback is done, allowing minimal latency.
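A minimal sketch of selecting the low-latency mode, assuming only the documented field. The real `audio_chain_instance_params_t` contains more members, and `select_low_latency()` is a hypothetical helper.

```c
#include <stdbool.h>

/* Stand-alone mock of the documented field; the real
 * audio_chain_instance_params_t has additional members. */
typedef struct
{
    bool isProcessSpecificTask; /* true: underrun/overrun safe; false: low latency */
} audio_chain_instance_params_t;

/* Hypothetical helper: select the single-task, minimal-latency implementation. */
static void select_low_latency(audio_chain_instance_params_t *pParams)
{
    pParams->isProcessSpecificTask = false;
}
```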
The current version of AudioChain does not apply any timestamps to audio samples. However, it is possible to measure the latency of a dataflow using the USB audio class. Please refer to this chapter.
Troubleshooting
Dataflow start failure
As mentioned, the dataflow may not start. In that case, the reason is given in the terminal logs.
When the user clicks the Start button, AudioChain will parse the dataflow and try to instantiate it.
It logs in the terminal all the connections made and the algorithms instantiated until an error is encountered.
As an example, here is a dataflow that fails:
As explained earlier, the connection properties are propagated (see the earlier section). It means that unless the user changes the parameters, the output connection of an element has the same properties as its input. In this case, the output of the signal generator is 1 channel; therefore, the output of the splitter will also be 1 channel. The mixer then receives 2 channels from the microphones and 1 channel of white noise. This is not acceptable and AudioChain will not start.
Note that the error indicated in the logs may not always originate from the specified block. It can result from an incorrect configuration in an earlier block within the dataflow.
Here is an example of the traces output of this error.
split Error : output chunks number of channels compatibility issue
To fix it, we just need to set 2 channels at the signal generator output.
LiveTune connection failure
LiveTune communicates with the STM32 board through UART. Therefore, if another client uses the same device UART (such as a tool like TeraTerm or another instance of LiveTune), the tool will not be able to connect to the board. The user has to close the other instance before accessing LiveTune.
Release start failure
When building a release or a development firmware, please follow the guidance provided here. In case of dataflow start error, please check the audio configuration as explained here.
How-to tips
Integrating encoders & decoders
AudioChain supports only fixed buffer sizes between elements, so variable bit rate encoders/decoders can be implemented as sink and source only.
LiveTune UART communication speed
The use of the UART involves a speed limitation: the transmission will never exceed the maximum UART speed, but a design can include several elements requiring large transmissions. This is, for example, the case of the graph-Viewer and msg-Viewer. The designer must pay attention to this aspect and remain reasonable in terms of UART bandwidth used. The UART transmission speed status helps with this issue.
Memory management
Depending on the board used, the memory can be located in different pools, and these pools have different properties and performance (for example ITCM, DTCM, HEAP, RAMEXT, etc.).
AudioChain will try to use all memory available on the board and will allocate blocks using rules. AudioChain will apply the strategy defined by user and apply fall-back strategy if it is not possible: see Memory pool allocation configuration.
In contrast, the livetune module allocates its memory preferably in the external memory, because speed is not critical for this module. If the board has external memory, AudioChain allocates blocks there only when it is not possible to allocate memory internally. This can have a side effect: external memory access is significantly slower than internal memory access. If your project requires external memory (mainly algorithms using a huge sample history), its use will have a significant performance impact on the CPU load. So, for the best performance, try to minimize as much as possible the use of this memory for sample processing. You can monitor the memory usage using the cli>mem or cli>mem2 command.
Using USB audio device
Any DAW (Digital Audio Workstation) / recording software can be used to record audio from the STM32; however, the following figures were made using Audacity. The STM32 audio USB connector is not the ST-LINK one with the Virtual COM port. Therefore, the user should plug in two USB cables.
Inside the DAW, please select the STM32 as recording & play back device as shown below.
Please set the recording preferences inside the DAW. In Audacity, it may be easier to activate the creation of a new track upon recording, as shown below.
Please make sure that the windows device Speakers (STM32 Headset) is not muted.
In some applications, it is useful to measure delays between tracks. In Audacity, it can be done as follows:
Generate a rhythm track as shown in the following figure.
Assuming the option shown in the previous figure is selected, start a recording while LiveTune is running a dataflow where the delay needs to be measured.
Start a recording; the generated rhythm track is then sent to the STM32 and the output audio is recorded from the STM32. The following figure illustrates this operation.
Audio Kit and USB
Audio-Kit implements USB for several classes that you can activate or deactivate:
AUDIO UAC2
CDC
HID
MIDI
During development, you can add or remove classes or change class features. These changes have an impact on the USB descriptors. For example, if you change the audio frequency, the USB audio driver must change too. Changing features involves modifying the USB device descriptor. Each time the device descriptor changes, the USB device must be reinstalled (that is, uninstalled from the Windows Device Manager and replugged). During this operation, the USB host keeps the new information in its registry:
- 1 USB Windows driver for each device and each device configuration
So, we cannot change the device’s features without reinstalling the device, which is painful! To work around this issue, we can change the application ID and create a different PID for each possible configuration. With this tip, we can have a device with multiple static configurations that can change dynamically between boots. In the PID, we encode features such as frequency, channels, and class that are changeable between each boot.
Consequently, the USB name will change according to the USB features activated or not.
Example: STM32-UAC 2.0, 1-STM32-UAC 2.0, 2-STM32-UAC 2.0, XX-STM32-UAC 2.0
But the USB device will work perfectly.
On the contrary, when the product is released, this feature must be removed. If the define ST_USE_DEBUG is not present, the build generates a unique PID for the board and its final configuration.
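The feature-to-PID mapping can be sketched as a simple bitfield encoding. The bit layout and function below are purely illustrative assumptions, not the actual encoding used by the firmware.

```c
#include <stdint.h>

/* Hypothetical bit layout (not the package's real one):
 * bits 0-2: sampling-frequency index, bits 3-5: channel count,
 * bits 6-7: class-configuration index. */
static uint16_t encode_debug_pid(uint16_t base_pid, uint8_t fs_idx,
                                 uint8_t nb_channels, uint8_t class_idx)
{
    uint16_t features = (uint16_t)((fs_idx & 0x7U)
                                   | ((nb_channels & 0x7U) << 3)
                                   | ((class_idx & 0x3U) << 6));
    /* Each feature combination yields a distinct PID, so Windows installs a
     * distinct driver per configuration instead of reusing a stale one. */
    return (uint16_t)(base_pid | features);
}
```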
During development, it is common to have more than 10 drivers for the same device. Windows does not handle this well and seems to mix up the USB device and driver. In this case, when recording or playing from Audacity or another tool, the USB driver is recognized by the tool, but the class seems to transmit nothing. If this case is detected, you must remove the useless drivers from the Windows Device Manager using this process:
- Run “devmgmt.msc” from an admin terminal.
- From the View menu, check “Show hidden devices”.
- From “Sound, video and game controllers”, remove all STM32 MIDI and STM32 UAC 2.0 devices: select each device and press Del.
- Unplug and replug the USB device from the board.
The driver will be re-installed properly by Windows.
System Inputs & Outputs
The System IOs configuration is handled inside the helper files audio_chain_sysIOs_conf.c & audio_chain_sysIOs_conf.h.
These are the files that need modification to add or remove an input or output.
In our package, they implement the several IOs needed to cover different use cases and demos. As a matter of fact, they use several defines that allow covering all these different use cases with dynamic reconfiguration.
audio_chain_sysIOs_conf_template.[ch]
provides a much simpler example with a single input and output implementation.
For the sake of simplicity, the configuration of the system IOs is fixed/hard-coded.
wrapper_audio_chain.c
This helper file connects AudioChain to the main application. Amongst other things, it initializes AudioChain and therefore allows adding all the system IOs. The main APIs to be used when adding or removing an IO are:
AudioChainSysIOs_addIn to add a system input,
AudioChainSysIOs_addOut to add a system output.
How to describe the system IOs
Please refer to the file audio_chain_sysIOs_conf.[ch] as starting point. They implement the System IOs used in the Audio-Kit package. For an even simpler example, please refer to the file audio_chain_sysIOs_conf_template.c that implements a single input.
Header file implementation
Notes: Inside the following code blocks:
Some source code is given as examples. It can be removed or modified.
Some comments mentioning where to add your own sources are given next to the already mentioned source code.
Inside the header file, please add one define per input & output as follows:
#define AC_SYSIN_MIC_NAME "SysIn_mic" /*!< System In connected to the microphone in the default configuration */
/*
* ... Add a define per input here
*/
#define AC_SYSOUT_SPK_NAME "SysOut_spk" /*!< System Out connected to the loud speaker in the default configuration */
/*
* ... Add a define per output here
*/
Please define the following enums for inputs and outputs:
typedef enum
{
AC_SYSIN_MIC, /* input for microphone */
/*
* ... Add all your inputs here
*/
AC_NB_MAX_SYS_IN /* Mandatory, please do not remove*/
} ac_sys_in_id_t;
typedef enum
{
AC_SYSOUT_SPK, /* output for loudspeakers or headphones*/
/*
* ... Add all your outputs here
*/
AC_NB_MAX_SYS_OUT /* Mandatory, please do not remove*/
} ac_sys_out_id_t;
If you need to build a Livetune configuration, please define the following table:
static const char_t *tLiveTuneDefineConv[][2] =
{
{"AC_SYSIN_MIC_NAME", AC_SYSIN_MIC_NAME},
/*
* ... Do the same for all your inputs here
*/
{"AC_SYSOUT_SPK_NAME", AC_SYSOUT_SPK_NAME},
/*
* ... Do the same for all your outputs here
*/
{0, 0}
};
C configuration file implementation
This file must configure a structure of type ac_sys_ios_t.
Please keep the following variables already defined in the template file:
static ac_io_descr_t sys_in_ios[AC_NB_MAX_SYS_IN];
static ac_io_descr_t sys_out_ios[AC_NB_MAX_SYS_OUT];
static const ac_sys_ios_t sys_ios =
{
.in =
{
.nb = AC_NB_MAX_SYS_IN,
.pIos = sys_in_ios
},
.out =
{
.nb = AC_NB_MAX_SYS_OUT,
.pIos = sys_out_ios
}
};
Now the function called AudioChainSysIOs_get must be implemented. For instance, the template C file provides one input for a microphone & one output for a loudspeaker as follows:
const ac_sys_ios_t *AudioChainSysIOs_get(void)
{
/* microphones pcm format */
sys_in_ios[AC_SYSIN_MIC].conf.chunkType = (uint8_t)AUDIO_CHUNK_TYPE_SYS_IN;
sys_in_ios[AC_SYSIN_MIC].conf.nbChannels = AC_SYSIOS_CH;
sys_in_ios[AC_SYSIN_MIC].conf.fs = AC_SYSIOS_FS;
sys_in_ios[AC_SYSIN_MIC].conf.nbElements = AC_SYSIOS_MS * AC_SYSIOS_FS / 1000UL;
sys_in_ios[AC_SYSIN_MIC].conf.nbFrames = 2U;
sys_in_ios[AC_SYSIN_MIC].conf.timeFreq = (uint8_t)ABUFF_FORMAT_TIME;
sys_in_ios[AC_SYSIN_MIC].conf.bufferType = (uint8_t)ABUFF_FORMAT_FIXED16;
sys_in_ios[AC_SYSIN_MIC].conf.interleaved = (uint8_t)ABUFF_FORMAT_INTERLEAVED;
sys_in_ios[AC_SYSIN_MIC].conf.pName = AC_SYSIN_MIC_NAME;
/*
* ... Add & configure all your other inputs here
*/
/* Local audio playback to audio codec */
sys_out_ios[AC_SYSOUT_SPK].conf.chunkType = (uint8_t)AUDIO_CHUNK_TYPE_SYS_OUT;
sys_out_ios[AC_SYSOUT_SPK].conf.nbChannels = AC_SYSIOS_CH;
sys_out_ios[AC_SYSOUT_SPK].conf.fs = AC_SYSIOS_FS;
sys_out_ios[AC_SYSOUT_SPK].conf.nbElements = AC_SYSIOS_MS * AC_SYSIOS_FS / 1000UL;
sys_out_ios[AC_SYSOUT_SPK].conf.nbFrames = 3U;
sys_out_ios[AC_SYSOUT_SPK].conf.timeFreq = (uint8_t)ABUFF_FORMAT_TIME;
sys_out_ios[AC_SYSOUT_SPK].conf.bufferType = (uint8_t)ABUFF_FORMAT_FIXED16;
sys_out_ios[AC_SYSOUT_SPK].conf.interleaved = (uint8_t)ABUFF_FORMAT_INTERLEAVED;
sys_out_ios[AC_SYSOUT_SPK].conf.pName = AC_SYSOUT_SPK_NAME;
/*
* ... Add & configure all your other outputs here
*/
return &sys_ios;
}
How to add the system IOs
In our package, this is done inside the wrapper_audio_chain.c file.
Please make sure that all the desired inputs & outputs, properly defined in the files listed above, are added.
Adding the IOs is done by calling AudioChainSysIOs_addIn & AudioChainSysIOs_addOut, whose parameters are:
pName = the name of the IO that will appear in the Livetune left panel,
pDescription = its description,
pAudioBuffer = a pointer to a structure whose type must be audio_buffer_t. This is the only thing that must be done by the user: create this buffer and configure it properly.
sysIoId = the id defined in the enum ac_sys_in_id_t & ac_sys_out_id_t.
For instance, in our LiveTune package, the IOs are the following:
AudioChainSysIOs_addIn("SysIn-Microphones", "from Microphones", UTIL_AUDIO_CAPTURE_getAudioBuffer(), (uint8_t)AC_SYSIN_MIC, UTIL_AUDIO_CAPTURE_used);
AudioChainSysIOs_addIn("SysIn-MicroPDM", "from PDM Microphones", UTIL_AUDIO_CAPTURE_getAudioBufferPdm(), (uint8_t)AC_SYSIN_PDM, UTIL_AUDIO_CAPTURE_used);
AudioChainSysIOs_addIn("SysIn-USB", "from USB host", UTIL_AUDIO_USB_PLAY_getAudioBuffer(), (uint8_t)AC_SYSIN_USB, NULL);
AudioChainSysIOs_addOut("SysOut-Codec", "towards audio codec", UTIL_AUDIO_RENDER_getAudioBuffer(), (uint8_t)AC_SYSOUT_SPK, UTIL_AUDIO_RENDER_used);
AudioChainSysIOs_addOut("SysOut-USB", "towards USB host", UTIL_AUDIO_USB_REC_getAudioBuffer(), (uint8_t)AC_SYSOUT_USB, NULL);
- In this case, the Utility component called stm32_audio.c configures all the required audio_buffer_t. For instance, UTIL_AUDIO_CAPTURE_Init configures the one for the microphones. UTIL_AUDIO_CAPTURE_getAudioBuffer() is just a getter for the pointer, and UTIL_AUDIO_CAPTURE_used is a callback routine returning true if the system IO is available (setting this callback to NULL means the IO is always available).
In the case of a VoiceComm use case over Ethernet, we would add the following:
AudioChainSysIOs_addIn("SysIn-ETH", "from ethernet", &gDownLinkBuffInfo, (uint8_t)AC_SYSIN_ETH, NULL);
AudioChainSysIOs_addOut("SysOut-ETH", "towards ethernet", &gUpLinkBuffInfo, (uint8_t)AC_SYSOUT_ETH, NULL);
- In this case, both audio_buffer_t structures gDownLinkBuffInfo & gUpLinkBuffInfo must be built. It can be done like this:
static int32_t s_createAppBuffers(void)
{
int32_t error = AudioBuffer_create(&gUpLinkBuffInfo,
UPLINK_BUFF_CHANNELS_NB,
UPLINK_BUFF_FS,
UPLINK_BUFF_SAMPLES_NB,
UPLINK_BUFF_FORMAT,
UPLINK_BUFF_TYPE,
ABUFF_FORMAT_INTERLEAVED,
AUDIO_MEM_RAMINT);
if (AudioError_isError(error))
{
AudioChainInstance_error(__FILE__, __LINE__, "AudioBuffer_create gUpLinkBuffInfo error");
}
if (AudioError_isOk(error))
{
error = AudioBuffer_create(&gDownLinkBuffInfo,
DOWNLINK_BUFF_CHANNELS_NB,
DOWNLINK_BUFF_FS,
DOWNLINK_BUFF_SAMPLES_NB,
DOWNLINK_BUFF_FORMAT,
DOWNLINK_BUFF_TYPE,
ABUFF_FORMAT_INTERLEAVED,
AUDIO_MEM_RAMINT);
if (AudioError_isError(error))
{
AudioChainInstance_error(__FILE__, __LINE__, "AudioBuffer_create gDownLinkBuffInfo error");
}
}
return error;
}
System IOs in this package
This is done inside the files:
audio_chain_sysIOs_conf.c
audio_chain_sysIOs_conf.h
As explained, our package implements several IOs as follows:
4 inputs
Microphones in PCM format are handled by the defines prefixed by AC_SYSIN_MIC,
USB playback is handled by the defines prefixed by AC_SYSIN_USB,
Audio over ethernet (downlink) is handled by the defines prefixed by AC_SYSIN_ETH,
Microphones in PDM format are handled by the defines prefixed by AC_SYSIN_PDM; this input is available only if the microphones are connected to a HW IP that does not handle the conversion to PCM data.
3 outputs
USB record is handled by the defines prefixed by AC_SYSOUT_USB,
Audio playback over the embedded codec is handled by the defines prefixed by AC_SYSOUT_SPK,
Audio over ethernet (uplink) is handled by the defines prefixed by AC_SYSOUT_ETH.
Using the defines listed above, the configurations can be changed in two ways:
Statically, by setting a static value for all these defines. This is what is done when using Devel & Release workspace of the different projects in the package.
Dynamically, by setting the define to some function. This is what is done when using the Livetune workspace of the different projects in the package. It enables the “ac audio” command, which allows rebooting the board with a new audio configuration, changing either the number of microphones or the sampling frequency, for instance.
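As an illustration, the two approaches could look like the following sketch. The define names with _STATIC/_DYNAMIC suffixes and the app_get_fs() getter are hypothetical; the package uses a single AC_SYSIOS_FS define configured one way or the other.

```c
/* Illustrative only: two hypothetical ways of resolving a SysIO define. */

/* Static (Devel & Release workspaces): a fixed value at build time. */
#define AC_SYSIOS_FS_STATIC   16000UL

/* Dynamic (Livetune workspace): the define expands to a function call, so the
 * value can change after an "ac audio" reboot. app_get_fs() is hypothetical. */
static unsigned long app_get_fs(void) { return 48000UL; }
#define AC_SYSIOS_FS_DYNAMIC  (app_get_fs())
```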
Algorithm Integration
Adding an algorithm inside AudioChain & Livetune requires implementing C code that describes the algorithm’s capabilities and provides functions that respect AudioChain’s API.
To make it simple, a python tool is provided:
AcIntegrate.py
requires an input JSON file that describes the algorithm; this file is referred to as the ID_CARD and is described below.
The $PYTHONPATH variable must be set to the folder Middlewares/ST/Audio-Kit/src/algos/; this can be done in two ways:
From python dev environment: as usual, it must be done manually.
From a bash command line: please source the set_pythonpath.sh script inside the folder Middlewares/ST/Audio-Kit/src/algos/scripts/.
AcIntegrate.py script description
To get the help menu for AcIntegrate, please run the following command:
python AcIntegrate.py --help
Here is the output:
usage: AcIntegrate.py [-h] --id_card ID_CARD --root_dir ROOT_DIR
[--no_clear]
[--no_file_copy] [--data_management DATA_MANAGEMENT]
[--var_request VAR_REQUEST] [--outdir OUTDIR]
[--log_filename LOG_FILENAME]
[--report_filename REPORT_FILENAME] [--verbose]
AcIntegrate arguments details
optional arguments:
-h, --help show this help message and exit
--no_clear do not delete output directories if it exists
--no_file_copy do not copy source file inside outdir
--data_management DATA_MANAGEMENT
Generation of code for data management. By default =
None, means that the developer as to implement it;
"untyped" will generate code to manage standard data
pointer; "audio_buffer" will generate code using the
audio_buffer_t structure available in the package.
"all" will generate both codes
--var_request VAR_REQUEST
List of variables to create in context and initialize
: can be None, all or a list from following values:
['sampleSize', 'nbChannels', 'nbSamples', 'fs',
'buffSize', 'nbCells', 'nbElements', 'channelsOffset',
'samplesOffset', 'samplesOffset0', 'interleaved',
'timeFreq', 'type']
--outdir OUTDIR Name directory that will host all algos wrapper
sources
--log_filename LOG_FILENAME
Name of the file with complete logs
--report_filename REPORT_FILENAME
text file report with summarized information
--verbose more traces
required arguments:
--id_card ID_CARD files with description of algorithm to integrate
inside audioChain
--root_dir ROOT_DIR root directory with all files to integrate
Note: Please be aware that running the script twice with the same outdir overwrites any modifications you may have made. It is therefore better to work with several outdirs if the script needs to be re-run.
Two options require some explanations:
- --data_management
- --var_request
data_management option
This option is there to specify the type of code that needs to be generated for data management.
| Possible values | description |
|---|---|
| None | No code is generated for managing data. It is up to the developer to do it all. |
| untyped | In this case, the script will generate the source code needed to get non-typed data pointers. |
| audio_buffer | In this case, the script will generate the source code needed to get data pointers formatted as the ST proprietary audio_buffer_t format. This requires using the middleware inside the folder Middlewares/ST/AudioBuffer/. |
Table 1 data_management option values descriptions
var_request option
While working with audio data and samples, it can be useful to get some information such as described inside Table 2.
| Possible values | description |
|---|---|
| None | No variables are created in the context; it is up to the developer to retrieve this information. |
| sampleSize | Size of samples in bytes |
| nbChannels | Number of channels in the data stream. |
| nbSamples | Number of samples in the data stream. |
| fs | Sampling frequency. |
| buffSize | Size of the buffer in bytes. |
| nbElements | Number of elements in the data streams. For temporal data, it is equal to the nbSamples field. However, in the frequency domain, it provides the number of bands |
| nbCells | The total number of data items in the stream. For temporal data, it is equal to nbSamples and nbElements. However, in the frequency domain, it is equal to nbElements*2 because these are the real & imaginary parts of the elements. |
| channelsOffset | Provides the offset to reach the next channel (byte offset = sampleSize* channelsOffset) |
| samplesOffset | Provides the offset to reach the next sample (byte offset = sampleSize* samplesOffset) |
| interleaved | Indicates whether the data stream is interleaved or not |
| time_freq | Indicates whether the data stream is in the time or frequency domain |
| type | Type of the data stream (PCM float 32 bits or fixed point 16 bits or 32 bits, PDM lsb first or msb first, or G711 encoded data) |
Table 2 var_request option values descriptions
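As an illustration of how the Table 2 variables relate to each other, here is a small sketch. It is not part of the generated code: the function and dictionary field names are invented for this example, and the interleaving/offset conventions shown are assumptions for a simple PCM layout.

```python
# Illustrative sketch (not generated code): deriving the remaining Table 2
# variables from the basic ones for a hypothetical stream descriptor.

def stream_info(sample_size, nb_channels, nb_samples, interleaved, time_freq="time"):
    """Derive Table 2-style variables; names and layout rules are assumptions."""
    nb_elements = nb_samples                       # equals nbSamples in the time domain
    # In the frequency domain, cells double: real & imaginary parts per element
    nb_cells = nb_elements * (2 if time_freq == "freq" else 1)
    buffsize = sample_size * nb_channels * nb_cells
    if interleaved:
        channels_offset = 1                        # next channel is the adjacent slot
        samples_offset = nb_channels               # next sample is nb_channels slots away
    else:
        channels_offset = nb_cells                 # channels stored back to back
        samples_offset = 1
    return {
        "buffsize": buffsize,
        "nbElements": nb_elements,
        "nbCells": nb_cells,
        # byte offsets as defined in Table 2: byte offset = sampleSize * offset
        "channelsByteOffset": sample_size * channels_offset,
        "samplesByteOffset": sample_size * samples_offset,
    }

# 16-bit stereo, 16 samples per frame, interleaved, time domain
info = stream_info(sample_size=2, nb_channels=2, nb_samples=16, interleaved=True)
print(info["buffsize"])             # 64 bytes
print(info["samplesByteOffset"])    # 4 bytes to the next sample of the same channel
```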
Step 1: ID_CARD creation
A documented example of ID_CARD is provided under the folder:
The script helps to build the id_card because it returns an error message when a field is missing or has a wrong value.
| field | description | Is required? | comment |
|---|---|---|---|
| name | Name of the algorithm | mandatory | |
| prio_level | Can be "normal" or "low" | mandatory | Amongst other cases, it can be useful with an ASR for instance ("low" must be used if the processing load is irregular and may sometimes exceed the duration of a frame). In such a case, the ASR processing can be tuned to "low" so that audio signal processing is guaranteed all along the data flow. |
| description | Description providing details about the algorithm | mandatory | |
| ios_consistency | Describes whether some input pin characteristics must be the same as those of the output pin. For instance, setting ios_consistency = "nbChan" forces input and output pins to have the same number of channels. | mandatory | |
| api_list | Hosts the list of C files that contain API information of the algorithm. | optional | |
| source_list | Hosts the list of C files that need to be linked in the project but are not API. If the option --no_file_copy is not used, the files are copied into the output directory. | optional | |
| input_pins | Describes the algorithm inputs (name, description, and parameters) | optional | See Table 4 for a detailed description |
| output_pins | Describes the algorithm outputs (name, description, and parameters) | optional | See Table 4 for a detailed description |
| static_params | Describes parameters that have an impact on memory allocation | optional | See Table 5 & Table 6 for a detailed description |
| dynamic_params | Describes parameters that have no impact on memory and can benefit from on-the-fly tuning (no de-initialization then initialization necessary) | optional | The script warns you in case of wrong settings. For instance, if pControl is left empty or wrong, the script provides the possible values. |
| control_params | Describes parameters dedicated to the control of the algorithm. The control structure allows setting or getting algorithm variables during execution. This is done through a callback mechanism; registering the callback is done with AudioAlgo_setCtrlCb. | optional | Described the same way as static & dynamic parameters, except that some fields are not necessary. Please refer to Table 6. |
| callbacks | Describes the list of callbacks of the algorithm. If NULL, nothing is done. | mandatory | |
Table 3 Fields descriptions
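To make Table 3 concrete, here is a hypothetical top-level id_card sketch. All names and values are invented for a fictitious "gain" algorithm; the documented reference example remains id_card_speex_nr.py in the template folder.

```python
# Hypothetical id_card sketch following Table 3 (invented values, illustration
# only; the real documented example is id_card_speex_nr.py).

id_card = {
    "name": "gain",
    "prio_level": "normal",            # "normal" or "low"
    "description": "Applies a static gain to the input stream",
    "ios_consistency": "nbChan",       # input/output pins share the channel count
    "callbacks": {                     # see Table 7 for the callback roles
        "init": "gain_init",
        "deinit": "gain_deinit",
        "dataInOut": "gain_data_in_out",
    },
}

# Minimal check in the spirit of the script's error reporting
MANDATORY = {"name", "prio_level", "description", "ios_consistency", "callbacks"}
missing = MANDATORY - id_card.keys()
print(sorted(missing))                 # []: all mandatory fields are present
```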
| field | Description | Is required? | comment |
|---|---|---|---|
| consistency | Allows checking whether some parameters of the IO must comply with some restrictions. For instance, as of today, frequency-domain processing only works with floating-point and non-interleaved data. In such a case the setting should be: "consistency" : ["interleaving", "type"] | optional | Not mandatory but helps while debugging |
| nb | Number of pins | mandatory | The fields below can be described as a single value or as a list of possible values. The script sends a warning in case of wrong settings, providing the possible values. For instance, if an algorithm supports mono and/or stereo, then the "nbChan" field should be: "nbChan" : ["1ch","2ch"]. To get the possible values, just fill in a wrong value such as "nbChan" : "xxx"; the script will output the possible ones. |
| nbChan | Number of channels | mandatory | |
| fs | Sampling frequency | mandatory | |
| interleaving | Interleaved or not | mandatory | |
| time_freq | Time or frequency domain | mandatory | |
| type | Type of supported data (PCM floating or fixed point, PDM data, or other formats such as G711-encoded data) | mandatory | |
| list | Provides the name & description of all IOs | mandatory | |
Table 4 Pins description
| field | description | Is required? |
|---|---|---|
| struct_name | Name of the C structure with parameter fields. | mandatory |
| params | Python dictionary whose keys are the parameter names and whose values are the parameter details described in Table 6 | mandatory |
Table 5 Parameters description
| field | description | Is required? | comment |
|---|---|---|---|
| pDefault | Default value | mandatory | Not needed for the "control_params" field |
| pDescription | Textual description | mandatory | |
| pControl | Type of control that will appear in LiveTune | mandatory | Not needed for the "control_params" field. To get the possible values, just fill in a wrong value such as "xxx"; the script will output the possible ones. |
| min | Minimal supported value | mandatory | Not needed for the "control_params" field |
| max | Maximal supported value | mandatory | Not needed for the "control_params" field |
| type | C type of the parameter fields (float, int, etc.) | mandatory | |
Table 6 Parameters details description
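Combining Tables 5 and 6, a static_params entry can be sketched as below. This is a hypothetical fragment: the parameter "nbBands", its values, and the control name "slider" are invented for illustration, and the validation loop only mimics the spirit of the script's error reporting.

```python
# Hypothetical static_params fragment (illustration only; field names follow
# Tables 5 & 6, values are invented for a fictitious algorithm).

static_params = {
    "struct_name": "gain_static_param_t",   # C structure holding the fields
    "params": {
        # key = parameter name, value = details as in Table 6
        "nbBands": {
            "pDefault": 4,
            "pDescription": "Number of frequency bands",
            "pControl": "slider",           # hypothetical LiveTune control name
            "min": 1,
            "max": 16,
            "type": "int",
        },
    },
}

# Minimal check mimicking the script's missing-field error messages
REQUIRED = {"pDefault", "pDescription", "pControl", "min", "max", "type"}
for name, details in static_params["params"].items():
    absent = REQUIRED - details.keys()
    if absent:
        raise ValueError(f"parameter '{name}' is missing fields: {sorted(absent)}")
print("static_params fragment looks well-formed")
```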
| field | description | Is required? | comment |
|---|---|---|---|
| init | Allocation and initialization as a function of static parameters, if any. | mandatory | |
| deinit | Frees resources. | mandatory | |
| configure | Applies dynamic parameter configuration | optional | No impact on memory allocation |
| dataInOut | Retrieves input data from the input IOs and provides output data to the output IOs. It is a mandatory routine but can be empty of any audio task. However, in order to trigger the processing callback, AudioAlgo_incReadyForProcess(pAlgo) MUST be called. If no processing callback is needed, calling AudioAlgo_incReadyForProcess is not necessary. | mandatory | Doesn't need to be called at the same pace as the processing callback. For instance, assume dataInOut runs every 1 ms but processing must be called every 8 ms; then the developer must ensure that dataInOut calls AudioAlgo_incReadyForProcess every 8 calls. In a low-latency-driven system it can host the complete processing; otherwise, it is better to delegate the main part of the processing to the process callback, which is automatically triggered when the input IOs have had at least a frame written. The dataInOut callbacks of all algorithms in a chain are called within a single task/thread. |
| process | Should host high-CPU-demanding processing. It must have a lower priority than dataInOut. Two priority levels are available: "normal" vs "low", as explained in Table 3. | optional | Optional routine; everything can be done in dataInOut as explained. The process callbacks of all algorithms in a chain are called within a single task/thread whose priority can be tuned lower than the thread of dataInOut. |
| checkConsistency | Checks that inputs & outputs are in line with the algorithm's specific tuning. Most consistency checks are generic; however, some checks depending on parameter values can be implemented there. | optional | For instance, a data resampler whose API offers a conversion ratio should make sure that the in & out IO frequencies match this ratio. |
| control | Allows the user to enroll a callback that will provide control data | optional | For instance, the RMS algorithm uses this mechanism. The RMS value is processed continuously; however, it is provided back to the application as a function of the refresh rate parameter, through this callback. |
Table 7 Callbacks description
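The pacing rule described for the dataInOut callback (trigger the process callback every N-th call) can be sketched as follows. This is Python pseudocode with hypothetical names; the real callbacks are C functions of the AudioChain framework.

```python
# Sketch of the dataInOut pacing rule from Table 7: dataInOut runs every
# frame (e.g. every 1 ms), but the process callback must only be triggered
# every 8 frames. Names loosely mirror the C API; not actual framework code.

class AlgoSketch:
    def __init__(self, frames_per_process=8):
        self.frames_per_process = frames_per_process
        self.call_count = 0
        self.process_triggered = 0

    def inc_ready_for_process(self):
        # Stands in for AudioAlgo_incReadyForProcess(pAlgo) in the C API
        self.process_triggered += 1

    def data_in_out(self):
        # Called every frame; copies IO data in the real implementation
        self.call_count += 1
        if self.call_count % self.frames_per_process == 0:
            self.inc_ready_for_process()   # trigger process every 8th call

algo = AlgoSketch()
for _ in range(24):            # simulate 24 ms of 1 ms frames
    algo.data_in_out()
print(algo.process_triggered)  # 3: process was triggered every 8th call
```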
Step 2: Generate code
As explained, AcIntegrate.py generates source code from the id_card. Once the code has been generated, the user needs to connect it to their own functions manually. Indeed, AcIntegrate.py doesn't know how to properly call the algorithm's API, i.e. whether it has a specific format for input & output data pointers, or whether it requires some parameters as arguments of different functions such as init, etc.
The id_card called "id_card_speex_nr.py" inside the folder Middlewares/ST/Audio-Kit/src/algos/template/ is given as an example of how we integrated the algorithm called "speex_nr" that is inside Middlewares/ST/Audio-Kit/src/algos/speex_nr/. In the generated file "audio_chain_speex_nr.c", most of the init code is generated in the function:
static int32_t s_speex_nr_init(audio_algo_t *const pAlgo);
However, the low-level init API still needs to be called. It is done like this:
pContext->user.pHdle = speex_preprocess_state_init((int)nbSamples, (int)fs); /* Store the provided handler in local context */
In this specific case, since both sampling frequency and number of samples are mandatory parameters, the script was run with the option:
--var_request nbSamples,fs
Step 3: Integrate the algorithm
At this stage, the generated files must be linked to the workspace project (IAR or STM32CubeIDE). Once all compilation issues are solved, the algorithm should appear in the LiveTune designer.
How to choose a list of algorithms
In order to reduce the memory footprint, it might be necessary to reduce the list of algorithms. This case may happen when:
- Building a final dataflow from the "Generate" button in LiveTune. In this case, the list of required algorithms is generated and framed by the define "ALGO_USE_LIST".
- Porting LiveTune & AudioChain to an STM32 with a smaller memory size.
There are two ways of building a selection of algorithms:
- Excluding files from compilation: inside IAR or STM32CubeIDE, exclude the algorithm folders from compilation. This method makes the overall build faster. If only a few algorithms need to be removed, it is probably the easiest way.
- Setting the define "ALGO_USE_LIST" and building a subset list of algorithms framed by this define. As explained, this is the method used when generating a LiveTune dataflow. The user can start from the complete list and remove the algorithms that are not required. The complete list is available in the file "audio_chain_algo_list_template.c" in the folder Middlewares/ST/Audio-Kit/src/helpers/.
Source Code Integration
There are two ways of integrating Audio-Kit:
- start from a project from this package, then port it to the targeted STM32 and complete with your own application.
- start from an existing project and incorporate software components delivered in this package.
This section provides detailed explanations of all the software components. It also explains which ones are mandatory or not, as a function of the user requirements. However, it is possible to integrate Audio-Kit without going through all the details by using the installer; please refer to this section.
Audio-kit Middleware is made of two main components:
- Livetune to handle communication with the host
- AudioChain to handle audio data flow
It lays on top of a third middleware called AudioBuffer, which provides the `audio_buffer_t` used by AudioChain. It also uses some third-party middlewares providing audio processing. Finally, it uses a set of utilities to support features such as cycle & memory monitoring, traces, terminal commands, a JSON interpreter, etc.
LiveTune integration is needed for full-feature or development targets only. The release projects only use AudioChain.
As explained here, Audio-Kit offers three types of builds:
- Designer:
- uses LiveTune & AudioChain
- Offers the capability to build dataflows and execute them.
- Tuner:
- LiveTune is not linked whereas AudioChain is.
- Allows tuning of dataflows after they have been generated. The footprint is smaller than in Designer mode.
- Release:
- LiveTune is not linked whereas AudioChain is.
- Only includes what is necessary to run the generated dataflow, resulting in the smallest footprint.
Software components description
Please bear in mind that software components have their own release notes providing some description as well. The firmware source code contains:
Utility Components:
- Audio:
- Offers a common API to manage audio through a callback registering mechanism, allowing different BSP APIs or hardware to be handled with a single API.
- Starts microphone recording and playback in sync.
- Performs audio capture conditioning.
- Encapsulates USB record and play to provide an `audio_buffer_t` API.
- Optional: Users may prefer to directly call their own API to manage audio hardware.
- Terminal:
- Implements a generic console.
- The set of commands can be customized for application needs.
- Optional in the end application but mandatory to use LiveTune.
- CyclesCnt:
- Monitors CPU load.
- Tracks task interruptions to provide consistent data.
- Optional if cycle monitoring is not wanted
- Traces:
- Single API for sending traces over different outputs (UART & display if needed).
- Optional: Users may prefer to use their own traces.
- STOs:
- Wrapper of the CMSIS-OS with added features.
- `st_os_mem` provides services to create memory pools for flexible dynamic allocation (built on top of STPmem).
- Optional but strongly recommended for LiveTune usage:
  - Not used in bare-metal implementations.
  - If an RTOS is used, each software component that creates an OS task has its own API using STOs. To remove STOs, re-write these APIs with direct calls to the chosen RTOS.
  - Without `ST_OS_mem`, using all memory banks for dynamic allocation will require another service.
  - Mind that if it is not used, the specificities implemented inside the `st_os_compiler_support.[ch]` files should be analyzed carefully.
- STJson:
- API to read and write JSON. Used for communication with the host or to describe audio data flows.
- Optional: Can be removed if UART commands are not needed. Mandatory if Livetune is used.
- STPmem:
- Memory pool manager.
- Optional: `malloc` can be used instead, but dynamic memory allocation won't be possible in all banks without a similar service.
- Care must be taken when adding or removing STPmem. Refer to the relevant section.
Middleware Components:
- PDM2PCM Conversion Library:
- Provided by STMicroelectronics.
- Audio-kit also offers a CIC decimation filtering as an alternative.
- Optional if the STM32 used offers a hardware block that performs the decimation to PCM samples.
- AudioBuffer:
  - Offers the `audio_buffer_t` API.
  - Manages simple audio buffers.
  - Manages audio mallocs (used inside algorithms and AudioChain); these mallocs may use `ST_OS_mem`/`STPmem` mallocs (through the `st_os_mem.c` API) if `AUDIO_MEM_CONF_STOS_USED` is defined, otherwise they use standard library mallocs.
  - Mandatory
- AudioChain:
- STMicroelectronics firmware that allows quick creation of audio processing chains.
- please refer to this section for more details.
- Mandatory
Common files for managing hardware & software initialization, run, and deinitialization of different boards:
Using these files is not mandatory. However, the developer may need to mimic what is done to enable services.
Located in `Projects\STM32xxx\Applications\Livetune\Common`:
- RTOS Hooks:
  - Overwrites `main_rtos_init` to connect `st_os_init`.
  - Used to create a background task (e.g., for cycle count).
  - Implements some RTOS functions such as `vApplicationStackOverflowHook`, etc.
- stm32xx_it.c:
- Single file for all boards.
- Optional: the developer may use their own version.
- main_hooks.c:
  - Contains `main_hooks*` functions that are `weak`. They offer a "plug & play" means of connecting the different services described before.
  - It also offers some handy features such as stack monitoring. Of course, good care must be taken with the scatter file (linker file).
  - Optional
- boardSetup.[ch]:
  - Provides a common API to start the hardware used in this package.
- Optional
- audio_persist_config.c:
  - Manages `ac_audio` commands (change audio config with a reset).
  - Optional: In general, only one audio configuration is used. This mechanism lets the user select/test different audio configurations. Mind that a switch from one configuration to another requires a hardware reset.
Located in `Projects\STM32xxx\Applications\Livetune\Platform`:
- Common files per board:
  - Implements `core_init.c` (system clock, MPU config, etc.).
  - `stm32_audio_setup_*.c`:
    - Board-dependent source code of the Utility/Audio service used to manage audio hardware.
    - Optional if the BSP is used directly. However, please ensure the `audio_buffer_t` structure is created to register the SysIOs.
- audio_config.c:
  - Implements all `ac_audio` commands available on a board.
  - Not needed if `audio_persist_config.c` is not used.
- Patch folder:
  - Contains patches for issues with BSP implementations (e.g., missing code for microphone expansion boards, inadequate decimation configuration, potential bugs in BSP or HAL).
  - These patches are mandatory to maintain the feature level in the package but may be removable if not needed.
AudioChain Integration in the application
Adding the AudioChain firmware inside a project requires calling a few C APIs, but it also requires taking good care of some system constraints to avoid drift issues.
The AudioChain library has already been briefly described in this chapter and more details are provided here.
The AudioChain core is hardware-independent. However, its user instance requires links to hardware (Audio System IOs, UART, etc.) and other software services (cycle count, memory monitoring, traces, OS management, etc.).
The System IOs are hardware-dependent:
- See chapter System Inputs & Outputs.
- For instance, replacing tinyUsb with USBx is quite straightforward. In this package, tinyUsb is wrapped inside `Utilities/Audio/stm32_audio_tinyusb.c`. This file can be used as a reference. It creates the `audio_buffer_t` variable required to register the USB system IOs.
AudioChain Instance
In the package, an instance is created and configured in the C files
`audio_chain_instance.[ch]`
This helper file is provided as an example. It is located in the folder Middlewares/ST/Audio-Kit/src/helpers, amongst other helper files for the registration of user services to the core (OS, CycleCnt, Traces, etc.).
Before starting AudioChain, the following tasks must be done at initialization:
Add system inputs or outputs
API = AudioChainSysIOs_addIn & AudioChainSysIOs_addOut
Please refer to the System Inputs & Outputs chapter for more information.
Initialize the library
API = AudioChainInstance_init and audio_chain_instance_params_t
Initialize the AudioChainInstance
This file has been implemented for the needs of our package. It can be modified and simplified. In our package, we registered utilities for traces, cycle counting, and logs of different types, as well as OS tasks created in the files:
- audio_chain_tasks_cmsisos (wrapper of FreeRTOS tasks) => mandatory to run LiveTune with our package
- audio_chain_tasks_no_os (wrapper for SW tasks, i.e. bare-metal implementation) => can be used for the target release (final product) when LiveTune is no longer needed.
Services activation is done through the audio_chain_instance_params_t argument, as depicted in the initialization example below.
Initialize the dataflow
API = AudioChainInstance_initGraph.
When used with LiveTune, this routine is hooked to allow the design phase.
For a target delivery, this function is generated (Generate button in LiveTune).
Initialize tuning
API = AudioChainInstance_initTuning.
For a final product, after everything has been tuned, it is no longer necessary to call this function.
audio_chain_instance_params_t params =
{
.traceEnable = true,
.isDataInOutSpecificTask = true,
.isProcessSpecificTask = true,
.isControlSpecificTask = true,
.logInit = false,
.logCmsisOs = UTIL_AUDIO_LOG_TASK_QUEUE_LEVELS
};
AudioChainInstance_init(&params);
AudioChainInstance initialization example.
When stopping the system, the AudioChain closure should be as follows:
if (AudioChainInstance_isStarted())
{
AudioChainInstance_deinitTuning();
AudioChainInstance_deinitGraph();
AudioChainInstance_deinit();
}
While the LiveTune application is running, we sometimes need to check if the dataflow needs to be de-initialized. This is managed by the AudioChainInstance_idle API. This function should be called inside a background or idle task.
To run AudioChain, the API AudioChainInstance_run should be called at a regular pace. Two defines must be set properly:
#define AC_N_MS_PER_RUN
This is the number of milliseconds between each call of AudioChainInstance_run.
This value can be different from the frame size processed by AudioChain, but the frame size must be a multiple of this value. In that case, AudioChainInstance_run is called several times before the processing is triggered.
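The relation between the run period and the frame size can be sketched as follows. This is an illustration only, using the define names as plain variables; the real scheduling is done inside the AudioChain framework.

```python
# Sketch of the AC_N_MS_PER_RUN / AC_FRAME_MS relation: run() is called every
# AC_N_MS_PER_RUN ms, but the callbacks only fire once a full AC_FRAME_MS
# worth of data has accumulated. Illustration only, not framework code.

AC_N_MS_PER_RUN = 1   # ms between two AudioChainInstance_run() calls
AC_FRAME_MS = 8       # ms of data expected before triggering the callbacks

# The frame size must be a multiple of the run period
assert AC_FRAME_MS % AC_N_MS_PER_RUN == 0

accumulated_ms = 0
processed_frames = 0
for _ in range(32):                    # simulate 32 ms of run() calls
    accumulated_ms += AC_N_MS_PER_RUN
    if accumulated_ms == AC_FRAME_MS:  # enough data: trigger the callbacks
        processed_frames += 1
        accumulated_ms = 0
print(processed_frames)                # 4 frames processed over 32 ms
```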
#define AC_FRAME_MS
This is the amount of data that AudioChain is expecting before triggering dataInOut, process & control call-backs.
The frame size can be tuned; by default, we set #define AC_FRAME_MS 8UL. However, it can be changed to 1 ms thanks to the "ac audio" command. Mind that this may increase the overall CPU load, as it may end up causing a higher number of task context switches.
In our system, the AudioChainInstance_run is triggered by the Microphones DMA interruption which occurs every UTIL_AUDIO_N_MS_PER_INTERRUPT = 1 ms. Therefore, the AC_N_MS_PER_RUN is set to UTIL_AUDIO_N_MS_PER_INTERRUPT.
To ensure no drift or asynchronism, we took good care that all System Inputs and Outputs are well synchronized with this interruption:
- Audio playback: the audio codec is clocked from the same source as the microphones and is started in sync with the microphones.
- The USB audio data are retrieved/sent just before/after calling AudioChainInstance_run.
It is possible to call AudioChainInstance_run from an interruption other than the microphones', but one must take good care of the drift and synchronization of the system inputs and outputs with this new triggering point. Also, AC_N_MS_PER_RUN should be updated accordingly.
Connecting AudioChainInstance to the application
- The API provided by `audio_chain_instance.c` is called inside the file `wrapper_audio_chain.c`:
  - It overwrites a set of `main_hooks_*` functions to call AudioChain from the `main_hooks.c` file. If you have a different `main_hooks.c`, you can call the `AudioChainInstance` API directly in your main. Please see:
    - `main_hooks_audioInit` for initializing AudioChain. You will see that we also perform the needed peripheral initialization there, but you can do it wherever it suits you.
    - `main_hooks_audioDeInit` for service closure.
    - `main_hooks_audioIdle` calls `AudioChainInstance_idle()`.
  - It also overwrites the function `UTIL_AUDIO_feed`:
    - Called by Utility/Audio components via the function `UTIL_AUDIO_process()`.
    - Again, in this package, the system is started and synchronized with the microphone DMA interruption. `UTIL_AUDIO_process` is called in the OS task `AudioIn_Thread` triggered by the Mic DMA IT. This is one way of doing things. In case there is no digital microphone but audio data coming from an SAI interface, it is possible to trigger from the SAI DMA interruption. You could also trigger from the SAI play DMA IT. However, we ensured that the microphone and loudspeaker In & Out are in sync in our system to avoid using drift compensation. In the case of USB, the feedback pipe is needed so that the host fixes the jitter/drift.
    - `UTIL_AUDIO_feed` makes sure that all system inputs are filled, then calls AudioChainInstance_run() before sending system output data wherever it needs to be sent.
Here is an example of code:
/* Retrieve samples from USB Audio device Class into buffer used as AudioChain USB system input
* Host is PC most of the time or can be anything else */
if (AudioError_isError(UTIL_AUDIO_USB_PLAY_get(pAudioChainBuff_usbPlay)))
{
AudioChainInstance_error(__FILE__, __LINE__, "UTIL_AUDIO_USB_PLAY_get() error");
}
/* UTIL_AUDIO_feed is called from the audio microphone DMA IT. Since the output
 * DMA is synchronized with it and we just retrieved samples from USB, all
 * AudioChain IOs are set, so we can run AudioChain */
AudioChainInstance_run();
/* Sends samples to host via USB */
/* If second buffer is NULL then only first buffer is sent; in case it is mono
* it will be duplicated since USB rec is always stereo*/
main_hooks_audioFeed(pAudioChainBuff_usbRec, NULL);
Includes and symbols
In order to get the stack monitoring feature implemented in our `main_hooks.c`, please set the define:
#define STACK_MONITORING
Here is the list of defines needed for Speex integration:
- FLOATING_POINT
- USE_SMALLFT
- OS_SUPPORT_CUSTOM
Designer mode
Here is the list of defines needed for the Livetune designer mode:
- AUDIO_CHAIN_ACSDK_USED /* ACSDK is always used in this package, even for code generation */
- AUDIO_CHAIN_CONF_TUNING_CLI_USED /* Enables tuning thru the terminal CLI */
- AUDIO_MEM_CONF_TRACK_MALLOC /* Instrumented Malloc for monitoring and easier debug */
- AUDIO_ASSERT_ENABLED /* Enable numerous Assert in the code */
Release mode
Here is the list of defines needed for the Release mode (LiveTune is not linked whereas audioChain is):
- AUDIO_CHAIN_ACSDK_USED /* ACSDK is always used in this package, even for code generation */
- AUDIO_CHAIN_RELEASE /* Specifies release mode to remove asserts and other features useless in a release. Can be done otherwise ...*/
- RTOS_BACKGROUND_TASK_ENABLE=0U /* The background task is used in Livetune mode to help cycle count. In release mode, it is no longer useful. */
- ALGO_USE_LIST /* The `ALGO_USE_LIST` define is used to specify we no longer use the full list of algorithms but only the one needed by the generated dataflow.*/
Tuner mode
Here is the list of defines needed for the Livetune Tuner mode:
- AUDIO_CHAIN_ACSDK_USED /* ACSDK is always used in this package, even for code generation */
- AUDIO_CHAIN_CONF_TUNING_CLI_USED /* Enables tuning thru the terminal CLI */
- AUDIO_MEM_CONF_TRACK_MALLOC /* Instrumented Malloc for monitoring and easier debug */
- AUDIO_ASSERT_ENABLED /* Enable numerous Assert in the code */
- ALGO_USE_LIST /* The `ALGO_USE_LIST` define is used to specify we no longer use the full list of algorithms but only the one needed by the generated dataflow.*/
Install Audio-Kit in a CubeMx repository
This package is delivered with its IOCs, and they can be opened by CubeMX. However, if a user wants to integrate Audio-Kit into a CubeMX repository, it is possible to install Audio-Kit thanks to the script install.py inside the folder Tools.
The script expects one argument whose content is a JSON file such as the template file given next to the script. This JSON file has three attributes:
- source_folder = string with audioKit package root path “C:/PathToAudioKit/”,
- destination_folder = string with CubeMx repository in which AudioKit should be installed; i.e “C:/PathToCubeMxRepository/”,
- installer_infos = dictionary that describes the files & folders to install. The template file `InstallTemplate.json` inside the folder Tools provides a complete list of the source code & tools needed.
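The three attributes above can be assembled into the JSON argument as sketched below. The paths are the placeholder examples from this section, and the file name "my_install.json" is invented; the authoritative field list is the one in InstallTemplate.json next to the script.

```python
# Hypothetical sketch of building the JSON argument for install.py using the
# three attributes listed above. Paths are placeholders; see
# InstallTemplate.json for the complete installer_infos content.
import json

config = {
    "source_folder": "C:/PathToAudioKit/",
    "destination_folder": "C:/PathToCubeMxRepository/",
    "installer_infos": {},   # files & folders to install, see InstallTemplate.json
}

with open("my_install.json", "w") as f:
    json.dump(config, f, indent=2)

# The script would then be invoked with this file as its single argument:
#   python install.py my_install.json
```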
Integrate AudioKit in IOC project
The IOC files delivered in this project allow generating ready-to-use projects thanks to post-generation steps. The script is called postGen.py and is registered in CubeMX under the ProjectManager / Code Generator field "After Code Generation". postGen.py takes no arguments for ease of use.
Audio-Kit is added to CubeMX project using the .extSettings file which CubeMX uses to link some external files in a project code generation.
The CubeMX way of handling multi-application is to have one IOC per application. As explained before, Audio-Kit has three applications:
- Designer,
- Tuner,
- Release.
As a matter of fact, in a CubeMX environment Audio-Kit is delivered with three different application folders, each containing an IOC file and its own .extSettings file.
There are some known issues with CubeMX code generation for Audio-Kit that the script postGen.py solves:
- ThreadX Generated Files:
- Appli/Src/tx_initialize_low_level.S needs to be patched. See the section "Why low_level_initialize.S for ThreadX is wrong" below for why this is necessary.
- HAL_conf.h Files:
- Need to be fixed because some IPs are included through the .extSettings mechanism, which doesn’t allow setting register callbacks (e.g., USE_HAL_UART_REGISTER_CALLBACKS 1U).
- Assembler Section Defines:
- CubeMX code generation doesn’t allow adding defines in the assembler section, but Audio-Kit requires this for the release target. See the “Fixing Assembler Options” section below.
- Float with printf:
- The option to use float with printf needs to be checked.
- AudioChain and AudioChainAlgos libraries are built with a maximum alignment size of 4 to be as compatible as possible with most build systems (IAR, STM32Cube) and external libraries. With the STM32Cube build system, this means that the application must be built with the additional -fpack-struct=4 option.
The Audio-Kit provided IOC files are implemented for secure-only mode with the context name “Appli”. To link files to a secure/non-secure application, please modify the .extSettings as follows:
- Change
[Appli:Groups]to[AppS:Groups]or[AppNs:Groups], ensuring proper distribution of the code.
Why low_level_initialize.S for ThreadX is wrong
In the CubeMx generated version of the low_level_initialize.S:
- The OS tick cycle count is wrong: it doesn't take into account the actual core clock frequency and the number of ticks per second; it uses a hard-coded value inside the assembler file instead. It could use SystemCoreClock and TX_TIMER_TICKS_PER_SECOND to compute the right value, i.e. "(SystemCoreClock / TX_TIMER_TICKS_PER_SECOND) - 1", either in C code or in assembler. See the patch proposal below; instead of:
/* Configure SysTick. */
MOV r0, #0xE000E000
LDR r1, =SYSTICK_CYCLES
STR r1, [r0, #0x14] // Setup SysTick Reload Value
MOV r1, #0x7 // Build SysTick Control Enable Value
STR r1, [r0, #0x10] // Setup SysTick Control
it could be:
/* Configure SysTick. */
LDR r0, =SystemCoreClock
LDR r0, [r0]
LDR r1, =TX_TIMER_TICKS_PER_SECOND
UDIV r0, r0, r1
SUB r1, r0, #1
MOV r0, #0xE000E000
STR r1, [r0, #0x14] // Setup SysTick Reload Value
MOV r1, #0x7 // Build SysTick Control Enable Value
STR r1, [r0, #0x10] // Setup SysTick Control
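The reload value computed by the patched code can be checked with simple arithmetic. The core clock value below is only an example, not a value taken from any specific board:

```python
# Check of the SysTick reload value computed by the patch above:
# reload = (SystemCoreClock / TX_TIMER_TICKS_PER_SECOND) - 1.
# The 400 MHz clock is an example value only.

SystemCoreClock = 400_000_000        # Hz, example core clock
TX_TIMER_TICKS_PER_SECOND = 1000     # OS ticks per second

reload_value = SystemCoreClock // TX_TIMER_TICKS_PER_SECOND - 1
print(reload_value)                  # 399999 cycles between two OS ticks
```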
- The _tx_initialize_low_level routine disables interrupts (the "CPSID i" assembler instruction) at the beginning of the routine but doesn't re-enable them at the end (it should add the "CPSIE i" instruction). Thus, if the user doesn't re-enable them after osKernelStart(), there are no interrupts and the firmware is blocked. Therefore, before:
/* Return to caller. */
BX lr
we can add:
/* Re-enable interrupts */
CPSIE i
How to reduce latency
First of all, please mind that some low-latency "ac audio" commands are available. They implement what is described in this section, except setting the DMA IT below 1 ms, which requires tuning some defines before compilation as described below.
Latency of the overall system is composed of the following elements:
- Latency of the system inputs and outputs: this is mainly due to buffering but can also come from processing group delay; e.g. an audio codec's HW processing may add a bit of latency.
- Latency of the AudioChain dataflow:
  - frame size
  - number of frames used for ping-pong
  - algorithms' group delay
  - chosen latency mode. Please refer to this chapter.
Thus, the main parameters we can work on to reduce the latency are the buffering size, the algorithm configuration (while designing a system, we may for instance accept fewer FIR filter coefficients as a compromise towards reduced latency), and the AudioChain latency mode.
System IOs latency
In this package, the systems IOs are:
- SysIn-Microphones from Microphones
- SysIn-MicroPDM from PDM Microphones
- SysOut-Codec towards audio codec
- SysOut-USB towards USB host
- SysIn-USB from USB host
The first three are handled by the Utility component stm32_audio.[ch], located in the Utilities/Audio folder. This component allows tuning the DMA interrupt period down to 1 ms through the UTIL_AUDIO_N_MS_PER_INTERRUPT define. It is also possible to configure stm32_audio for sub-millisecond DMA interrupts through the UTIL_AUDIO_N_MS_DIV define. In any case, the user must ensure that UTIL_AUDIO_N_MS_PER_INTERRUPT * UTIL_AUDIO_N_MS_DIV * 1000 divides the audio sampling frequency, to guarantee an integer number of samples per audio buffer.
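The divisibility constraint can be checked with a small C helper (hypothetical, not part of the package; only the define names come from the package):

```c
#include <stdint.h>

/* Illustrative check (not part of the package): returns the integer number
 * of samples per audio buffer, or 0 if the divisibility constraint stated
 * above is violated for the chosen define values. */
static uint32_t samples_per_buffer(uint32_t fs_hz,
                                   uint32_t n_ms_per_interrupt,
                                   uint32_t n_ms_div)
{
    /* Constraint: n_ms_per_interrupt * n_ms_div * 1000 must divide fs_hz. */
    if (fs_hz % (n_ms_per_interrupt * n_ms_div * 1000u) != 0u)
    {
        return 0u; /* non-integer sample count per buffer */
    }
    /* The buffer duration is n_ms_per_interrupt / n_ms_div milliseconds. */
    return fs_hz * n_ms_per_interrupt / (n_ms_div * 1000u);
}
```

For example, at 48 kHz with UTIL_AUDIO_N_MS_PER_INTERRUPT = 1 and UTIL_AUDIO_N_MS_DIV = 2, each 0.5 ms buffer holds 24 samples, while 44.1 kHz with a 1 ms buffer violates the constraint.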
In the case of an RTOS implementation, note that the OS tick rate must be tuned according to the audio rate: it should be equal to max(1000, 1000 * UTIL_AUDIO_N_MS_DIV / UTIL_AUDIO_N_MS_PER_INTERRUPT). That is, 1000 if UTIL_AUDIO_N_MS_DIV = 1, whatever the value of UTIL_AUDIO_N_MS_PER_INTERRUPT; 2000 if UTIL_AUDIO_N_MS_DIV = 2 and UTIL_AUDIO_N_MS_PER_INTERRUPT = 1; 4000 if UTIL_AUDIO_N_MS_DIV = 4 and UTIL_AUDIO_N_MS_PER_INTERRUPT = 1; and so on.
With FreeRTOS, the OS tick rate is given by configTICK_RATE_HZ in the FreeRTOSConfig.h file; with Azure RTOS (ThreadX), it is given by TX_TIMER_TICKS_PER_SECOND in the tx_user.h file.
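The tick-rate rule can be sketched as a small C helper (hypothetical, for illustration only):

```c
#include <stdint.h>

/* Illustrative helper: OS tick rate required by the audio rate, i.e.
 * max(1000, 1000 * UTIL_AUDIO_N_MS_DIV / UTIL_AUDIO_N_MS_PER_INTERRUPT). */
static uint32_t required_os_tick_rate(uint32_t n_ms_div,
                                      uint32_t n_ms_per_interrupt)
{
    uint32_t rate = 1000u * n_ms_div / n_ms_per_interrupt;
    return (rate > 1000u) ? rate : 1000u;
}
```

For example, with UTIL_AUDIO_N_MS_DIV = 4 and UTIL_AUDIO_N_MS_PER_INTERRUPT = 1, configTICK_RATE_HZ (FreeRTOS) or TX_TIMER_TICKS_PER_SECOND (ThreadX) should be set to 4000.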
The firmware is delivered with an OS tick rate of 1000. A higher OS tick rate reduces audio latency but increases CPU load: more RTOS context switches, and higher overhead in audio interrupts, in the AudioChain framework, and in the algorithms' processing, as well as more memory cache refills.
There are several measurement methods, depending on what the user wants to monitor:
- the overall latency
- the audioChain dataflow latency
- the USB latency
Overall latency measure
Setup description
The goal of this setup is to measure the latency from the microphone to the loudspeaker. It requires no change to the firmware source and is illustrated in the figure below. A sound card with two inputs is needed, to record both the signal from the audio codec jack connector (headphone signal) and the signal from an analog microphone placed next to the on-board STM32 MEMS microphone. Then, from Audacity, start a recording while playing a very recognizable wave file, such as a five-period sine burst. The latency can then be measured in Audacity by comparing the two channels of the recorded wave file. See the latency measured with this method using a simple passthrough algorithm from microphone to loudspeaker.
Latency from SysIn-Microphones to SysOut-Codec
This setup was used to measure latency with a stress-test LiveTune dataflow (8 channels, 100-coefficient FIR at 48 kHz), as described below:
The following table provides latency figures for this stress test. The formula relating these figures is:
- given x = audio DMA interrupt period
- given y = AudioChain frame size
- given l = frame added by the latency mode; if low latency mode is enabled, l = 0, otherwise l = 1
- given epsilon = latency added by the audio codec (internal processing and buffering), measured at about 0.5 ms
- note: in the following measurements, x = y except for the 8 ms frame size
latency = 2 * x + (l + 1) * y + dataflow group delay + epsilon
Therefore, assuming a simple passthrough without group delay, if x = 1 ms, y = 1 ms and l = 1, the minimum latency is about 3.5 ms. In the same conditions with the latency stress test (a linear-phase FIR adds a theoretical group delay of 'filter length' / 2 samples; a minimum-phase FIR adds a much smaller group delay), we measure 4.6 ms, meaning that the group-delay component is about 1.1 ms (very close to the theoretical group delay: 50 coefficients / 48 kHz).
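The formula can be evaluated directly in C (a hypothetical helper; the measured figures in the table also include effects that the formula only approximates):

```c
#include <stdint.h>

/* Illustrative evaluation of:
 *   latency = 2*x + (l + 1)*y + dataflow group delay + epsilon
 * x   = audio DMA interrupt period (ms), y = AudioChain frame size (ms),
 * l   = frame added by the latency mode (0 when low latency mode is enabled),
 * gd  = dataflow group delay (ms), eps = audio codec added latency (ms). */
static double ac_latency_ms(double x, double y, uint32_t l, double gd, double eps)
{
    return 2.0 * x + (double)(l + 1u) * y + gd + eps;
}
```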
Latency and MHz figures measured on an STM32N6570-DK platform:
| Frame duration (ms) | Audio Chain latency mode | Measured audio loop-back latency (ms) | Algos MHz | Audio framework MHz | Audio interrupts MHz | FreeRTOS MHz |
|---|---|---|---|---|---|---|
| 8 | disabled | 18.58 | 279 | 6.5 | 9 | 3 |
| 1 | disabled | 4.56 | 286 | 20 | 9 | 11 |
| 1 | enabled | 3.60 | 286 | 20 | 9 | 11 |
| 0.5 | enabled | 2.08 | 295 | 38 | 12 | 21 |
| 0.25 | enabled | 1.33 | 313 | 77 | 18 | 40 |
The following table shows how to tune the firmware to reach sub-millisecond DMA interrupt periods and frame durations.
| Frame duration (ms) | UTIL_AUDIO_N_MS_PER_INTERRUPT & AC_N_MS_PER_RUN | UTIL_AUDIO_N_MS_DIV & AC_N_MS_DIV | AC_FRAME_MS |
|---|---|---|---|
| 8 | 1 | 1 | 8 |
| 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 |
| 0.5 | 1 | 2 | 1 |
| 0.25 | 1 | 4 | 1 |
| 0.125 | 1 | 8 | 1 |
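For instance, the 0.5 ms row of the table corresponds to the following configuration fragment (values taken from the table; the duration relations in the comment are inferred from the table rows and should be treated as illustrative):

```c
/* Example configuration for a 0.5 ms frame duration (per the table above).
 * From the table rows it appears that:
 *   DMA interrupt period = UTIL_AUDIO_N_MS_PER_INTERRUPT / UTIL_AUDIO_N_MS_DIV ms
 *   frame duration       = AC_FRAME_MS / AC_N_MS_DIV ms
 * Both evaluate to 0.5 ms with these values. */
#define UTIL_AUDIO_N_MS_PER_INTERRUPT  1
#define UTIL_AUDIO_N_MS_DIV            2
#define AC_N_MS_PER_RUN                1
#define AC_N_MS_DIV                    2
#define AC_FRAME_MS                    1
```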
Latency of an AudioChain dataflow
This assumes the AudioChain latency mode has been enabled, as explained previously.
The procedure to measure a dataflow latency is as follows (it does not include the system IO latencies):
- In the project, ensure that the define LATENCY_CHECK is set inside the file wrapper_audio_chain.c.
#ifdef LATENCY_CHECK /* This mode is for debug or investigation */
/* Sets second buffer to what is coming from USB to measure latency of going
* through AudioChain
 * Both buffers can be multichannel; by default slot 0 is sent for both.
 * However it can be configured through:
 * - UTIL_AUDIO_USB_REC_setChannelsId(uint8_t ch1, uint8_t ch2)
 */
audio_buffer_t const *const pAudioChainBuff_codecPlay = UTIL_AUDIO_RENDER_getAudioBuffer();
if ((pAudioChainBuff_codecPlay != NULL) && (pAudioChainBuff_codecPlay->pInternalMem != NULL) &&
(pAudioChainBuff_usbPlay != NULL) && (pAudioChainBuff_usbPlay->pInternalMem != NULL))
{
main_hooks_audioFeed(pAudioChainBuff_codecPlay, pAudioChainBuff_usbPlay);
}
#else
/* If second buffer is NULL then only first buffer is sent; in case it is mono
* it will be duplicated since USB rec is always stereo*/
main_hooks_audioFeed(pAudioChainBuff_usbRec, NULL);
#endif
- The source code provided streams the input and output of the AudioChain dataflow to USB, allowing manual latency measurement through recording.
- In LiveTune, start a graph that takes input from the USB system input SysIn-USB and streams the output to be measured to the audio codec system output SysOut-Codec.
- Open your favorite DAW (such as Audacity).
- Ensure the audio settings are correct: if your dataflow runs at 16 kHz, set the Audacity project to 16 kHz as well, so that sample-rate conversion on the computer side does not affect the measurement.
- Generate a pulse or another recognizable signal.
- Select the STM32 UAC2.0 as the USB playback device.
- Play the pulse and record the USB input in parallel.
- In Audacity, measure the latency between the left and right channels; this corresponds to the overall dataflow latency. See following figure:
Note that the minimum achievable latency is given by this formula:
- ceil(AC_SYSOUT_SPK_NBFRAMES / 2) * AC_SYSOUT_SPK_MS * AC_SYSOUT_SPK_FS / 1000
- this is the formula when SysOut-Codec is used;
- if SysOut-USB is used instead, use the corresponding defines.
- For instance, at 16 kHz with a frame size of 8 ms and an output number of frames of 3, the minimum latency is 256 samples. [ceil(3/2) * 8 * 16000/1000 = 256]
- Therefore, two parameters have a major impact:
- The frame size: Reducing it to 1 ms will reduce latency but may increase CPU load (more context switches).
- The output number of frames.
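The minimum-latency formula above can be checked numerically with a hypothetical helper that takes the define values as plain parameters:

```c
#include <stdint.h>

/* Illustrative evaluation of:
 *   ceil(AC_SYSOUT_SPK_NBFRAMES / 2) * AC_SYSOUT_SPK_MS * AC_SYSOUT_SPK_FS / 1000
 * The result is expressed in samples. */
static uint32_t min_output_latency_samples(uint32_t nbframes,
                                           uint32_t frame_ms,
                                           uint32_t fs_hz)
{
    uint32_t half_rounded_up = (nbframes + 1u) / 2u;  /* ceil(nbframes / 2) */
    return half_rounded_up * frame_ms * fs_hz / 1000u;
}
```

With 3 output frames of 8 ms at 16 kHz, this yields 256 samples, matching the example above.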
USB latency setup
This setup requires a USB analyzer.
For this measurement, the board dataflow is a USB loopback. The PC sends edges to the board. Spying on the USB exchanges between the PC and the board allows checking the delay between rising or falling edges on the OUT and IN endpoints. This gives the board latency; see the picture below.
To reduce the USB latency on the board, open the tinyusb.h header file and locate these defines:
#define ST_RB_PLAY_MS_NUM 15UL
#define ST_RB_REC_MS_NUM 8UL
Depending on the use-case load and the PC responsiveness, they can be tuned down to:
#define ST_RB_PLAY_MS_NUM 5UL
#define ST_RB_REC_MS_NUM 3UL
With these values, we measured latencies in the range of 4.3 ms to 5.6 ms.